OpenAI

Tech Lead, Deployment & Operations — Custom Infrastructure

Remote Today

About the Team

OpenAI’s Hardware organization develops AI-native silicon and system-level solutions for the unique demands of advanced AI workloads. Building on efforts like Jalapeño, the team is developing future generations of AI-native silicon and tightly integrated systems to power the next generation of frontier models. By co-designing chips, systems, tools, and methodologies, the team helps deliver faster, more efficient, and production-ready hardware for OpenAI’s supercomputing platform.

About the Role

We are seeking a Technical Lead to head deployment and operations for OpenAI’s Silicon & Systems team. This person will be the Directly Responsible Individual (DRI) for bringing OpenAI’s custom silicon and associated systems into data center environments, ensuring successful deployment, bring-up, validation, operational readiness, and ongoing reliability at scale.

This role sits at the intersection of silicon, systems, infrastructure, data center operations, and software. You will lead a team focused on taking new hardware platforms from lab validation into production data center deployment. You will be responsible for building the operational processes, technical workflows, tooling, and cross-functional alignment required to deploy and operate custom AI hardware reliably in OpenAI’s supercomputing infrastructure.

The ideal candidate is both a strong leader and a deeply technical operator. You should be comfortable staying close to the technical details of hardware bring-up, fleet deployment, debugging, system validation, data center integration, and production operations. This role requires strong execution, excellent cross-functional judgment, and the ability to drive clarity in ambiguous, fast-moving environments.

In this role, you will:

  • Lead a team responsible for deployment and operations of OpenAI’s custom silicon and systems in data center environments

  • Own the path from hardware bring-up and validation through production deployment, operational readiness, and sustained fleet support

  • Partner closely with silicon, systems, software, infrastructure, networking, data center, supply chain, and external partner teams to ensure successful deployment at scale

  • Define deployment processes, operational playbooks, technical readiness criteria, escalation paths, and reliability practices for new hardware platforms

  • Drive cross-functional execution across lab bring-up, rack/system integration, data center deployment, fleet monitoring, debugging, and issue resolution

  • Stay hands-on tech

About the company

OpenAI

AI research and deployment company.