
Engineering Manager, Core Services

OpenAI · San Francisco, California, US
Full-time · Senior (8–20 yrs) · Posted 99d ago
Engineering Leadership · M2 · Management · On-site · Relocation
Stack: Distributed Systems · Infrastructure Engineering · Cluster Management · Workflow Orchestration · Blob Storage · Object Storage · File Systems · SLOs · Incident Response · Capacity Planning · Cloud Infrastructure · Multi-cloud · Platform Engineering · Service Reliability · Operational Rigor

Summary

OpenAI is hiring an Engineering Manager to lead high-performing teams building and operating mission-critical distributed systems, platform foundations, and large-scale storage/blob/file infrastructure that serve as the production backbone for all OpenAI products.

About the role

About the Team

The Core Services organization builds and runs the mission-critical online services that product teams rely on in production. We own foundational distributed systems and platform capabilities that enable reliable execution, high-performance services, and large-scale file/data needs across our products. This team is distinct from developer infrastructure and data infrastructure—our focus is production service foundations and core runtime services.

About the Role

We’re hiring an Engineering Manager, Core Services to lead teams responsible for highly reliable, high-scale distributed systems that sit on the critical path for OpenAI products. Your teams will own foundational production systems that OpenAI’s product engineering teams build on. You’ll collaborate closely with product and infrastructure partners to ship reliable services quickly, and with senior engineering leaders to scale the organization, mature operations, and drive major platform initiatives. This is a hands-on leadership role that requires strong technical depth.

You’ll be responsible for:

  • Managing and growing a high-performing team of infrastructure engineers.

  • Leading teams building and operating large, critical production platforms, including cluster reliability, scaling, and rollout safety.

  • Building and operating mission-critical distributed systems with strong operational rigor (SLOs, incident response, capacity planning, reliability).

  • Setting technical direction for platform foundations such as workflow/orchestration capabilities, large-scale file/blob/storage services, and core service foundations.

  • Partnering with a broad set of stakeholders, including product engineering, adjacent infrastructure teams, and (where relevant) finance/cost partners.

  • Coaching, mentoring, and developing engineers and emerging leaders.

You might thrive in this role if you:

  • Have significant experience leading teams that run mission-critical infrastructure in production.

  • Have experience operating mission-critical services or core distributed systems building blocks.

  • Have built platform-like systems (e.g., orchestration/workflow execution, service platforms) and/or large-scale storage/blob/file infrastructure.

  • Have experience building systems spanning multiple cloud environments.

  • Take pride in building scalable, reliable systems and improving operational health.

  • Own problems end-to-end and can move fast in environments with ambiguity and competing priorities.

Workplace & Location

  • This role is based in our San Francisco HQ. We offer relocation assistance for new employees.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. 

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

What you'll do

1. Manage and grow a high-performing team of infrastructure engineers
2. Lead teams building and operating large, critical production platforms including cluster reliability, scaling, and rollout safety
3. Build and operate mission-critical distributed systems with strong operational rigor (SLOs, incident response, capacity planning)
4. Set technical direction for platform foundations including workflow/orchestration capabilities, large-scale file/blob/storage services, and core service foundations
5. Partner with a broad set of stakeholders including product engineering, adjacent infrastructure teams, and finance/cost partners
6. Coach, mentor, and develop engineers and emerging leaders
7. Collaborate with senior engineering leaders to scale the org, mature operations, and drive major platform initiatives

Requirements

Significant experience leading teams operating mission-critical distributed infrastructure in production at scale
Hands-on background building platform-like systems such as orchestration/workflow engines, service platforms, or large-scale storage/blob/file infrastructure
Strong operational rigor including SLO definition, incident response, capacity planning, and rollout safety
Experience designing and operating systems spanning multiple cloud environments
Demonstrated ability to manage, coach, and grow senior engineers and emerging engineering leaders

Nice to have

Kubernetes
Workflow Engines (e.g. Temporal, Airflow)
Cloud Platforms (AWS, GCP, Azure)
Cost Engineering
Service Mesh
High-availability Systems Design

Role overview

Role family
Engineering Leadership
Level
M2 (platform)
Experience
8–20 years
Type
Management
Remote policy
On-site
Visa sponsorship
Not offered

Tech stack analysis

FRAMEWORKS
Workflow Orchestration Engines
DATABASES
Blob Storage · Object Storage · Large-scale File Systems
INFRASTRUCTURE
Multi-cloud (AWS / GCP / Azure) · Cluster Management · Service Platforms · Distributed Systems
TOOLS
SLO Tooling · Incident Management · Capacity Planning Tools

Salary estimate

$380K – $600K
AI-estimated salary range
Confidence: 75%
Reasoning

OpenAI is one of the highest-paying tech employers globally. An Engineering Manager at this level (M2) in San Francisco overseeing mission-critical infrastructure typically commands total compensation between $380K and $600K+. This includes a base salary of roughly $250K–$330K plus equity (RSUs or profit participation units) and performance bonuses. OpenAI's compensation packages are known to be significantly above market due to the company's valuation and talent competition with Google DeepMind, Meta, and other top AI labs.


Green flags

5 items
OpenAI is one of the fastest-growing and most strategically important AI companies globally — exceptional career trajectory opportunity (growth)



Hiring insights

JD quality
7/10
Urgency
high
Autonomy
high
Team size
medium (5-15)


Red flags

4 items
No salary or compensation range disclosed in the posting, reducing transparency for candidates (compensation)


Interview insights

Rounds
6
Duration
4 wks
Difficulty
very hard
Take-home
No


Career path

Next roles
Director of Engineering, Infrastructure → VP of Engineering → SVP of Platform Engineering

About the company

OpenAI is the AI research laboratory behind GPT-4, ChatGPT, DALL-E, and the Codex API. With over 200 million weekly active ChatGPT users, OpenAI is at the forefront of large language model development and deployment. The company pursues a mission of building safe artificial general intelligence that benefits all of humanity.

HQ: San Francisco, CA, USA
Interview difficulty: very hard
Build vs Maintain: both
Cross-functional: Yes