Braintrust

AI Engineer

Braintrust Remote Today

About the company

Braintrust is the AI observability platform. By connecting evals and observability in one workflow, Braintrust gives builders the visibility to understand how AI behaves in production and the tools to improve it.

Teams at Notion, Stripe, Zapier, Vercel, and Ramp use Braintrust to compare models, test prompts, and catch regressions, turning production data into better AI with every release.

About the role

We’re looking for an AI Engineer to work directly with our most strategic customers and help them successfully deploy, scale, and extract value from Braintrust in real production environments.

This is a deeply technical, customer-facing role at the intersection of engineering, product, and go-to-market. You’ll partner closely with customer engineering teams to instrument real AI workflows, establish production baselines, operationalize evaluations, and build the feedback loops that make AI systems reliable at scale.

You will need strong judgment about how AI systems behave in the real world, how to evaluate them, and how to improve them iteration by iteration.

If you are excited by the challenge of solving open-ended problems, enjoy shipping quickly, and take full ownership of customer outcomes, this role offers outsized impact on both customer success and Braintrust’s product roadmap.

What you’ll do

  • Partner closely with customer engineering teams to deploy, stabilize, and continuously improve AI applications in production

  • Instrument and trace real-world AI workflows end-to-end, establishing baseline targets for latency, cost, quality, and reliability

  • Turn production data into datasets and evaluations; define scoring rubrics and implement CI quality gates

  • Build prototypes, integrations, and custom workflows that help customers operationalize evaluations and observability as part of their SDLC

  • Deploy and troubleshoot Braintrust in customer environments (cloud or self-hosted), working across application, data, and infrastructure layers

  • Act as the technical lead in customer engagements, running an operating cadence and feeding real-world learnings back into Product and Engineering

What you’ll bring

  • 3–7+ years of experience as a software engineer or forward-deployed / field engineer

  • Strong backend or full-stack engineering skills (Python strongly preferred; TypeScript a plus)

  • Hands-on experience working with LLMs, APIs, or agentic workflows in production environments

  • Familiarity with cloud infrastructure and deployment patterns (AWS preferred; Docker/Kubernetes a plus)

  • Comfort working directly with customers and owning technical outcomes end-to-end

  • Strong communication skills and ability to translate between business needs and technical implementation

  • Bias toward action: you enjoy shipping scrappy but production-ready solutions and iterating quickly

Nice to haves

  • Experience with AI observability, evaluation frameworks, or ML/LLMOps tooling

  • Prior experience in a startup, founding team, or 0→1 product environment

  • Experience supporting enterprise or self-hosted deployments

  • Willingness to travel occasionally for on-site customer engagements

Benefits include

  • Medical, dental, and vision insurance

  • Daily lunch, snacks, and beverages

  • Flexible time off

  • Competitive salary and equity

  • AI stipend

Equal opportunity

Braintrust is an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.
