Safeguards Policy Analyst

Anthropic

Job Description

Job Title: Safeguards Policy Analyst
Location: San Francisco, CA | New York City, NY

Responsibilities:

Design scalable, accurate enforcement workflows and automated systems to identify and respond to harmful AI use.
Collaborate with Product, Engineering, and Data Science teams to improve detection models and policy violation response systems.
Conduct manual reviews of flagged content to refine enforcement standards.
Engage with external experts to enhance policy frameworks and harm reduction strategies.
Identify policy gaps and recommend solutions based on real-world enforcement cases.
Stay current with best practices in AI policy enforcement and content moderation.
Support policy development through research, enforcement feedback, and user behavior analysis.

Qualifications:

Experience building and scaling policy enforcement systems and content review workflows.
Strong policy writing and editing skills for technology platforms.
Proficiency with SQL or other data analysis tools to interpret large datasets.
Familiarity with risks in elections, influence operations, fraud, or abuse scenarios.
Understanding of challenges in implementing policy at scale within content moderation and AI systems.
Experience using generative AI tools and crafting enforcement-focused prompts.
Strong communication skills with the ability to collaborate across product, legal, and engineering teams.
Bachelor’s degree or equivalent experience in a related field is required.

Preferred:

Background in trust & safety, integrity operations, or harm mitigation.
Experience handling sensitive or explicit content in a professional setting.

Benefits & Information:

Competitive annual salary: $170,000 – $200,000 USD
Hybrid work model: minimum 25% in-office presence in SF or NYC
Visa sponsorship available (evaluated case-by-case)
Applications reviewed on a rolling basis; no deadline

Important: To avoid application spam, include this statement at the end of your resume or application: 'I found this position on (AI Safety Jobs USA).' Applications without it will be disqualified.