About Kilo: Kilo is the all-in-one, open-source agentic engineering platform. We're #1 on OpenRouter, have 1M+ active developers, and process over 21 trillion tokens monthly. We're growing fast, shipping constantly, and building in public.
Kilo Code is more than just another coding tool. We're creating a superset of the best features from existing AI coding agents, combined with our own innovations, all built with community feedback at the core.
The situation: We’re an open-source AI coding tool taking on Cursor, Windsurf, and the rest. We are growing fast and want to work with engineers who are excited about accelerating software development across the globe while shipping features our users love.
We’re hiring an engineer to build the best autocomplete in the world.
Autocomplete is still the front door. It’s how most developers first learn to trust AI in their workflow — and it’s the gateway drug to agentic engineering. If we win autocomplete, we earn the right to graduate users into multi-step agent flows.
Why this matters at Kilo
Kilo is already a top-tier agentic engineering platform. But autocomplete is the first impression — and first impressions compound. You’re building the most powerful entryway into agentic engineering that exists.
If you want to obsess over quality, speed, and real-world evaluation — and you want to do it in public, at scale, with a team that ships uncomfortably fast — this is the seat.
This role is for an AI tinkerer: someone who gets weirdly excited about small deltas, tight eval loops, and the difference between “pretty good” and “industry best.”
What you’ll do
Own Autocomplete end-to-end: product feel, model behavior, latency, reliability, quality, and iteration velocity.
Build the benchmark that matters: not a vanity leaderboard — a real eval harness that tracks what users actually experience (acceptance rate, edit distance, time-to-accept, latency, regressions by language/project type).
Run tight experiment loops: tweak prompts, decoding, routing, caching, context packing, and ranking — then measure, learn, repeat.
Form strong opinions: FIM vs next-edit, speculative decoding, reranking, context window strategy, model routing, and how to trade off quality vs speed without lying to yourself.
Work directly with model creators: test new models early, give actionable feedback, and ship upgrades fast when something is truly better.
Ship constantly: small improvements weekly, bigger improvements monthly, and a compounding moat over time.
You’re a fit if…
You’ve built systems where a 1–2% improvement matters, and you know how to measure it without fooling yourself.
You love benchmarks, but only the kind that survive contact with production.
You can move between product intuition (“this feels wrong”) and hard engineering (“here’s the eval + traces + fix”).
You can be scrappy: quick prototypes, fast rollouts, clear rollbacks, no drama.
You want to be judged on outcomes: “autocomplete got meaningfully better,” not “we merged a bunch of PRs.”
The work setup:
Remote (strong preference for ET, CT, or Western Europe time zones), but we ship together in person every 2-3 months (think: hackathon energy, not conference rooms).
We work with folks in other time zones, so effective hand-offs and writing things down (or putting them in pull requests) are critical. Expect to be on lots of Google Meet calls, but mostly short, focused 1:1s, and very few meetings where you are just warming the chair.
We thank people publicly, give feedback directly, and own our mistakes.
Your dopamine needs to come from shipping, not from code for code's sake.
Anti-requirements:
Needing pixel perfect designs before starting on something
Needing a QA department
“I didn’t build that, so I’m not fixing that.”
Comfort with slow
Waiting for consensus before making a call
The Reality of Working at Kilo
We want to be transparent about what it takes to succeed here.
We Work Exceptionally Hard: We are ambitious, and ambition requires effort. Most people who do not succeed at Kilo fail because they underestimate the work ethic required. We’re not a 9-9-6 shop, but we do expect you to bring energy, intensity, and grit every single day.
High Accountability: We don't hide behind vanity metrics. You will own outcomes, not just pull requests. You will have a number (WAUzer), and you will be responsible for hitting it.
A Driven Culture: We foster healthy competition and a shared will to win. We support each other, but we also push each other to be the absolute best.
Apply with:
Links to stuff you’ve shipped, your GitHub profile, portfolio, etc. Not required, but we want to get to know as much about you as we can before we meet!