We're building the ground truth platform for AI.
Generic tools hallucinate data, confabulate reports, and don't show their work. We made accuracy the only possible outcome: every answer traces to its source, every calculation is reproducible, every insight is defensible. We're starting in finance and building the foundational data layer for every domain where decisions depend on trustworthy data.
Kepler was founded by two Palantir veterans (20 years combined) who built core parts of Gotham and Foundry, created Palantir Quiver (the analytics engine behind $100M+ deals with BP and Airbus), led major DoD projects, and served as Head of Business Engineering at Citadel.
Kepler is backed by founders of OpenAI, Facebook AI, MotherDuck, dbt, Outerbounds, and others.
You'll build and maintain the data pipelines and infrastructure that power Kepler's AI-driven research platform. Financial data is fragmented and messy: SEC filings, earnings transcripts, market data feeds, research reports, and internal documents. You'll help ingest, structure, and unify all of it into a coherent system where every answer traces back to its source.
This is a greenfield environment with significant opportunity to influence technical direction, establish best practices, and grow alongside the platform.
Own and ship a major data pipeline end-to-end
Contribute to foundational technology decisions that shape platform architecture
Build ingestion systems that power real financial research workflows
Help establish data engineering patterns and best practices for the team
Build & Maintain Data Pipelines
Design and implement ingestion pipelines from heterogeneous sources: SEC filings, earnings transcripts, market data, research reports, and internal documents
Handle structured, unstructured, and semi-structured data formats
Ensure pipelines are reliable, scalable, and well-monitored
Support Data Architecture
Contribute to decisions around storage technologies, indexing strategies, and retrieval systems
Build semantic layers that normalize entities across sources and resolve ambiguity
Implement data provenance so every number traces to a source document and section
Enable AI & Analytics Workloads
Build infrastructure for document processing, embedding pipelines, and vector search
Support retrieval systems that surface the right context from millions of documents
Collaborate with AI/ML engineers to ensure data infrastructure meets model requirements
Ensure Data Quality & Governance
Build and maintain observability, monitoring, and validation systems
Implement data quality frameworks and governance standards
Own data reliability metrics and drive continuous improvement
Ship with Production Excellence
Write comprehensive tests and maintain CI/CD deployment pipelines
Participate in code reviews and contribute to engineering best practices
Monitor production systems and respond to data quality issues
Required
5+ years of data engineering experience building production data pipelines and platforms
Strong experience designing ingestion, storage, transformation, and retrieval systems
Proficiency working with structured, unstructured, and semi-structured data
Hands-on experience with modern data stack tools: orchestration (e.g., Airflow, Temporal), storage (e.g., PostgreSQL, S3), and processing frameworks
Solid understanding of SQL, Python, and at least one systems language (Rust, Go, etc.)
Experience with Git workflows, CI/CD, and automated testing
Strong communication skills, with the ability to articulate technical trade-offs to both engineering and business stakeholders
A track record of thriving in fast-paced, high-ownership environments
Nice to Have
Experience with vector databases, embedding pipelines, or retrieval-augmented generation (RAG) systems
Familiarity with document processing or audio data pipelines
Financial services or fintech data experience
Experience with data quality frameworks and governance tooling
Exposure to Kubernetes, Docker, and infrastructure-as-code (e.g., Pulumi, Terraform)
Don't check every box? Apply anyway. We prioritize speed of learning, problem-solving skills, attention to detail, and drive to build world-class data infrastructure.
Direct collaboration with founders who built Palantir Foundry and data infrastructure at Citadel
Weekly 1:1s with founders
Architectural reviews and guidance on data system design
Clear growth path toward senior data engineering and platform leadership roles
Frontend: React, TypeScript, Vite, Tailwind, Radix, TanStack, Zustand
Backend: Rust, Node.js, Python, PostgreSQL, Redis
AI/ML: OpenAI, Anthropic, MCP SDK
Infrastructure: AWS (S3, RDS), Docker, Temporal, Kubernetes, Dataflow
Tools: Git, GitHub, Pulumi, Auth0, SharePoint
Comprehensive medical, dental, and vision insurance for employees and dependents, plus a 401(k)
Automatic coverage for basic life, AD&D, and disability insurance
Daily lunch in office
Development environment budget - latest MacBook Pro, multiple monitors, ergonomic setup, and any development tools you need
Unlimited PTO policy
"Build anything" budget - dedicated funding for whatever tools, libraries, datasets, or infrastructure you need to solve technical challenges, no questions asked
Learning budget - attend any conference, course, or program that makes you better at what we're building
Forward-Deployed with Product DNA: We own customer outcomes while building a product company. That means embedding, iterating, and deploying where our customers are. We don't win if they don't win.
Extreme Ownership: Big vision, shared ownership. If you notice a problem, you own it. Authority comes from initiative, not job titles. Once you step up, you're accountable for the outcome.
Production-First Engineering: We design for critical workloads from day one. Durable execution, blue/green deploys, automated rollbacks, continuous delivery with end-to-end observability. Every change lands safely and stays resilient under real-world load.
Trust as the Default: People do their best work when confidence is mutual. We show our work, keep our promises, and flag risks before they bite. Trust isn't an aspiration. It's the baseline.
Keep Raising the Bar: We block time for training, code-health sprints, and deep-dive tech talks. A sharper team and a cleaner stack pay compounding dividends. Continuous learning isn't a perk. It's part of the job.
Kepler is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind. We are committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment.