Job Description
Job Title: Machine Learning Engineering Manager, Safeguards
Location: San Francisco, CA
Responsibilities:
- Define and execute a vision for safeguarding Anthropic's AI systems through machine learning-based detection and enforcement.
- Lead and mentor a team of ML and software engineers in building scalable, protective AI technologies.
- Design and deploy solutions to identify harmful usage and measure real-world harms, both qualitatively and quantitatively.
- Collaborate with Product, Policy, and Enforcement stakeholders to define risk vectors and respond to evolving adversarial threats.
- Drive interdisciplinary collaborations across research, engineering, and policy to align on safety objectives.
- Foster team development through coaching, feedback, and best practices in technical management.
Qualifications:
- 5+ years of experience leading technical teams in ML-intensive environments.
- 5+ years of hands-on work in trust & safety, risk detection, or anti-fraud systems involving applied ML.
- Deep familiarity with adversarial content detection and misuse prevention.
- Strong communication skills to convey complex ML safety strategies to diverse audiences.
- Proven ability to manage cross-functional initiatives and balance multiple priorities.
Preferred:
- Experience deploying ML models for platform safety in production environments.
- Knowledge of emerging adversarial techniques and online threat vectors.
- Passion for building responsible and beneficial AI systems.
- Experience leading teams during high growth and organizational change.
Education:
Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
Benefits and Information:
- Salary: $340,000 - $425,000 USD (annual)
- Work Arrangement: Hybrid, with at least 25% on-site presence in our San Francisco office
- Visa Sponsorship: Available (case-by-case basis, with dedicated immigration legal support)
- Application Review: Rolling basis, no fixed deadline
- Anthropic is committed to inclusive hiring practices and encourages candidates from all backgrounds to apply.
Important: To avoid application spam, include this statement at the end of your resume or application: 'I found this position on ( AI Safety Jobs USA ) .' Applications without it will be disqualified.
Location: San Francisco, CA
Job Type: Full-time
Category: Secure and Safe Deployment Jobs
Compensation: $340k - $425k