
Engineering Manager, Core Services

OpenAI · San Francisco, California, US
Full-time · Senior (8–20 yrs) · Posted 99d ago
Engineering Leadership · M2 · Management · On-site · Relocation
Stack: Distributed Systems · Infrastructure Engineering · Cluster Management · Workflow Orchestration · Blob Storage · Object Storage · File Systems · SLOs · Incident Response · Capacity Planning · Cloud Infrastructure · Multi-cloud · Platform Engineering · Service Reliability · Operational Rigor

Summary

OpenAI is hiring an Engineering Manager to lead high-performing teams building and operating mission-critical distributed systems, platform foundations, and large-scale storage/blob/file infrastructure that serve as the production backbone for all OpenAI products.

About the role

About the Team

The Core Services organization builds and runs the mission-critical online services that product teams rely on in production. We own foundational distributed systems and platform capabilities that enable reliable execution, high-performance services, and large-scale file/data needs across our products. This team is distinct from developer infrastructure and data infrastructure—our focus is production service foundations and core runtime services.

About the Role

We’re hiring an Engineering Manager, Core Services to lead teams responsible for highly reliable, high-scale distributed systems that sit on the critical path for OpenAI products. Your teams will own foundational production systems that OpenAI’s product engineering teams build on. You’ll collaborate closely with product and infrastructure partners to ship reliable services quickly, and with senior engineering leaders to scale the organization, mature operations, and drive major platform initiatives. This is a hands-on leadership role that requires strong technical depth.

You’ll be responsible for:

  • Managing and growing a high-performing team of infrastructure engineers.

  • Leading teams building and operating large, critical production platforms, including cluster reliability, scaling, and rollout safety.

  • Building and operating mission-critical distributed systems with strong operational rigor (SLOs, incident response, capacity planning, reliability).

  • Setting technical direction for platform foundations such as workflow/orchestration capabilities, large-scale file/blob/storage services, and core service foundations.

  • Partnering with a broad set of stakeholders, including product engineering, adjacent infrastructure teams, and (where relevant) finance/cost partners.

  • Coaching, mentoring, and developing engineers and emerging leaders.

You might thrive in this role if you:

  • Have significant experience leading teams that run mission-critical infrastructure in production.

  • Have experience operating mission-critical services or core distributed systems building blocks.

  • Have built platform-like systems (e.g., orchestration/workflow execution, service platforms) and/or large-scale storage/blob/file infrastructure.

  • Have experience building systems spanning multiple cloud environments.

  • Take pride in building scalable, reliable systems and improving operational health.

  • Own problems end-to-end and can move fast in environments with ambiguity and competing priorities.

Workplace & Location

  • This role is based in our San Francisco HQ. We offer relocation assistance for new employees.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. 

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

What you'll do

1. Manage and grow a high-performing team of infrastructure engineers
2. Lead teams building and operating large, critical production platforms including cluster reliability, scaling, and rollout safety
3. Build and operate mission-critical distributed systems with strong operational rigor (SLOs, incident response, capacity planning)
4. Set technical direction for platform foundations including workflow/orchestration capabilities, large-scale file/blob/storage services, and core service foundations
5. Partner with a broad set of stakeholders including product engineering, adjacent infrastructure teams, and finance/cost partners
6. Coach, mentor, and develop engineers and emerging leaders
7. Collaborate with senior engineering leaders to scale the org, mature operations, and drive major platform initiatives

Requirements

Significant experience leading teams operating mission-critical distributed infrastructure in production at scale
Hands-on background building platform-like systems such as orchestration/workflow engines, service platforms, or large-scale storage/blob/file infrastructure
Strong operational rigor including SLO definition, incident response, capacity planning, and rollout safety
Experience designing and operating systems spanning multiple cloud environments
Demonstrated ability to manage, coach, and grow senior engineers and emerging engineering leaders

Nice to have

Kubernetes
Workflow Engines (e.g. Temporal, Airflow)
Cloud Platforms (AWS, GCP, Azure)
Cost Engineering
Service Mesh
High-availability Systems Design

Role overview

Role family
Engineering Leadership
Level
M2 (platform)
Experience
8–20 years
Type
Management
Remote policy
On-site
Visa sponsorship
Not offered

Tech stack analysis

FRAMEWORKS
Workflow Orchestration Engines
DATABASES
Blob Storage · Object Storage · Large-scale File Systems
INFRASTRUCTURE
Multi-cloud (AWS / GCP / Azure) · Cluster Management · Service Platforms · Distributed Systems
TOOLS
SLO Tooling · Incident Management · Capacity Planning Tools

Salary estimate

$380K – $600K
AI-estimated salary range
Confidence: 75%
Reasoning

OpenAI is one of the highest-paying tech employers globally. An Engineering Manager at this level (M2) in San Francisco overseeing mission-critical infrastructure typically commands total compensation between $380K and $600K+. This includes a base salary of roughly $250K–$330K plus equity (RSUs or profit participation units) and performance bonuses. OpenAI's compensation packages are known to be significantly above market due to the company's valuation and talent competition with Google DeepMind, Meta, and other top AI labs.


Green flags

5 items
OpenAI is one of the fastest-growing and most strategically important AI companies globally — exceptional career trajectory opportunity (growth)



Hiring insights

JD quality
7/10
Urgency
high
Autonomy
high
Team size
medium (5-15)


Red flags

4 items
No salary or compensation range disclosed in the posting, reducing transparency for candidates (compensation)


Interview insights

Rounds
6
Duration
4 wks
Difficulty
very hard
Take-home
No


Career path

Next roles
Director of Engineering, Infrastructure → VP of Engineering → SVP of Platform Engineering

About the company

OpenAI is the AI research laboratory behind GPT-4, ChatGPT, DALL-E, and the Codex API. With over 200 million weekly active ChatGPT users, OpenAI is at the forefront of large language model development and deployment. The company pursues a mission of building safe artificial general intelligence that benefits all of humanity.

HQ: San Francisco, CA, USA
Interview difficulty: very hard
Build vs Maintain: both
Cross-functional: Yes