Machine Learning Engineer, Safeguards

Anthropic

Job Description

Job Title: Machine Learning Engineer – Safeguards
Location: San Francisco, CA or New York City, NY

Responsibilities:

Develop and deploy machine learning models to detect anomalous or harmful user behavior.
Enhance automated systems for abuse detection and enforcement.
Analyze reports of inappropriate activity and proactively build classifiers to detect similar behaviors.
Collaborate with research teams to identify and mitigate safety risks during model training.
Surface emerging abuse patterns and contribute to system improvements in real time.

Qualifications:

4+ years in machine learning engineering or applied research, preferably in trust and safety.
Strong programming skills in Python and SQL, with experience in data mining and analysis.
Demonstrated experience building behavioral classifiers and anomaly detection systems.
Excellent communication skills, especially when explaining technical topics to diverse stakeholders.
Passion for AI safety and the societal implications of advanced AI systems.

Preferred:

Proficiency with machine learning frameworks such as PyTorch, TensorFlow, or Scikit-Learn.
Familiarity with transformer-based language models and reinforcement learning techniques.
Experience with scalable infrastructure for training and deploying large ML systems.

Benefits & Information:

Salary Range: $340,000 – $425,000 USD annually.
Work Environment: Hybrid work model with at least 25% in-office attendance.
Visa Sponsorship: Available for select candidates with legal support provided.
Education Requirement: Bachelor’s degree in a related field or equivalent professional experience.

We strongly encourage candidates from underrepresented backgrounds to apply, even if they don't meet every listed qualification. Anthropic values diverse perspectives and believes they are essential to the development of safe and beneficial AI.


Important: To avoid application spam, include this statement at the end of your resume or application: 'I found this position on (AI Safety Jobs USA).' Applications without it will be disqualified.
