About PlayerZero
At PlayerZero, we are building the first Engineering World Model: a unified system that explains, predicts, and resolves production failures autonomously.
Our mission is to move beyond the current landscape of "reactive tools" to create a new class of self-healing software where AI is an active participant in decision-making. By synthesizing codebases, tickets, and production telemetry, we've built a system that jointly represents intended behavior and runtime reality — and learns from where they diverge. That divergence is where production failures live. We're building the system that learns from every one of those moments. If that problem excites you, we should talk.
This role is a hybrid position based out of our San Francisco office. All candidates must be located in the Bay Area to be considered. Relocation assistance is not available for this position.
About the AI team
The AI team at PlayerZero is a lean, high-velocity group owning everything from agent architecture and model selection to evaluation infrastructure and reliability. We ship working systems over theoretical elegance, but we build with rigor — systematic evaluation, clear ownership, production-grade reliability.
Overview
Your focus is the World Model, our core representation of how code behaves, what breaks in production, how failures relate to changes, and whose attention to prioritize. You'll design the abstractions that let our AI systems navigate the gap between intended and actual behavior, and get better at it with every incident they process. This means designing entity representations that link issues, code, PRs, traces, and people; building memory systems that accumulate institutional knowledge; and researching how to surface the right context at the right time.
Responsibilities:
Architect the Graph: Design entity representations linking code, PRs, telemetry, tickets, and developer behavior into a unified graph
Build Institutional Memory: Develop memory and intent structures that capture how your best engineers actually debug and decide, so that knowledge compounds across every future incident
Improve Context Prioritization: Research and implement retrieval strategies that surface the right context for a given production task — what code, what history, what people, in what order
Surface Implicit Signals: Prototype methods for capturing latent patterns (codebase navigation, issue clustering, resolution behavior, etc.) to sharpen our agents’ accuracy over time
Ship It: Collaborate with Applied AI engineers to move research into production
Qualifications:
Must haves:
MS or PhD in ML/AI, or 3+ years of research experience in knowledge representation, model training, or related areas
Experience with fine-tuning, knowledge graphs, embedding models, and retrieval systems where relevance is a moving target
Strong Python skills; able to move from a research problem to a working prototype with minimal scaffolding
Solid understanding of LLM-based systems and how context affects model behavior
You’d thrive if:
You've built knowledge graphs or entity-linking systems that had to survive noisy, real-world data
You've treated context window limitations, and the choice of what context to include, as a first-class optimization problem
You've thought about how to represent code, changes, and issues in ways that enable learning
You're drawn to systems that compound in value with every failure they process
You’re interested in the intersection of code understanding, knowledge systems, and AI