About the company
Braintrust is the AI observability platform. By connecting evals and observability in one workflow, Braintrust gives builders the visibility to understand how AI behaves in production and the tools to improve it.
Teams at Notion, Stripe, Zapier, Vercel, and Ramp use Braintrust to compare models, test prompts, and catch regressions — turning production data into better AI with every release.
About the role
We’re looking for an AI Engineer to work directly with our most strategic customers and help them successfully deploy, scale, and extract value from Braintrust in real production environments.
This is a deeply technical, customer-facing role at the intersection of engineering, product, and go-to-market. You’ll partner closely with customer engineering teams to instrument real AI workflows, establish production baselines, operationalize evaluations, and build the feedback loops that make AI systems reliable at scale.
You will need strong judgment about how AI systems behave in the real world, how to evaluate them, and how to improve them iteration by iteration.
If you are excited by the challenge of solving open-ended problems, enjoy shipping quickly, and take full ownership of customer outcomes, this role offers outsized impact on both customer success and Braintrust’s product roadmap.
What you’ll do
Partner closely with customer engineering teams to deploy, stabilize, and continuously improve AI applications in production
Instrument and trace real-world AI workflows end-to-end, establishing baseline targets for latency, cost, quality, and reliability
Turn production data into datasets and evaluations; define scoring rubrics and implement CI quality gates
Build prototypes, integrations, and custom workflows that help customers operationalize evaluations and observability as part of their SDLC
Deploy and troubleshoot Braintrust in customer environments (cloud or self-hosted), working across application, data, and infrastructure layers
Act as the technical lead in customer engagements, running an operating cadence and feeding real-world learnings back into Product and Engineering
What you’ll bring
3–7+ years of experience as a software engineer or forward-deployed / field engineer
Strong backend or full-stack engineering skills (Python strongly preferred; TypeScript a plus)
Hands-on experience working with LLMs, APIs, or agentic workflows in production environments
Familiarity with cloud infrastructure and deployment patterns (AWS preferred; Docker/Kubernetes a plus)
Comfortable working directly with customers and owning technical outcomes end-to-end
Strong communication skills and ability to translate between business needs and technical implementation
Bias toward action: you enjoy shipping scrappy but production-ready solutions and iterating quickly
Nice to haves
Experience with AI observability, evaluation frameworks, or ML/LLMOps tooling
Prior experience in a startup, founding team, or 0→1 product environment
Experience supporting enterprise or self-hosted deployments
Willingness to travel occasionally for on-site customer engagements
Benefits include
Medical, dental, and vision insurance
Daily lunch, snacks, and beverages
Flexible time off
Competitive salary and equity
AI stipend
Equal opportunity
Braintrust is an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.