About the team & the platform you’ll own
Data Engineering is part of the Data Team. The team builds, maintains, and continuously improves the company’s data infrastructure, covering the Data Lake, CDC (Change Data Capture), compute clusters, and our workflow orchestrator (Apache Airflow), along with other analytics workloads that support decision-making across the business.
Today, we ingest and process tens of terabytes of data, with an aggregate workload of 1,000+ runtime hours per day across our pipelines and jobs. This role is for someone who gets excited about operating data systems at scale with reliability, performance, cost efficiency, and clean engineering practices.
About the role
We’re looking for a Senior Data Engineer to design and build robust, scalable data systems with a reliability-first mindset and strong software engineering fundamentals. You’ll lead architecture and technical decisions for pipelines (batch + streaming where applicable), mentor other engineers, and raise engineering standards across the team.
This is a hands-on role: you’ll ship, operate, and improve production systems, not just design them.
What you’ll do
Build and operate scalable data systems
- Design, build, and maintain scalable data pipelines (batch and streaming) that are reliable, observable, and cost-effective.
- Own end-to-end pipeline architecture: ingestion → processing → storage → serving, including data modeling and performance considerations.
- Improve and extend our core infrastructure: data lake, CDC pipelines, compute cluster workloads, and Airflow orchestration.
- Work deeply with distributed processing and data lake concepts, including performance tuning and stability at scale.
Engineering excellence & production readiness
- Develop in Python or another big-data language (e.g., Scala or Go), writing clean, modular, testable code.
- Apply strong software engineering practices: design patterns, trade-offs, DRY principles, dependency management, code reviews, and CI/CD.
- Raise the bar on documentation: architecture diagrams, data contracts, operational playbooks, runbooks, and decision records.
Reliability, observability, and incident ownership
- Define and operate system observability:
  - establish metrics/dashboards (latency, throughput, failure rate, resource usage, SLA/SLO adherence)
  - implement alerting and runbooks
- Lead root-cause analysis for complex incidents and recurring failures; implement permanent fixes (not just patches).
- Partner cross-functionally with analytics, product, platform, and DevOps teams to align data solutions with business needs.
Leadership & mentoring
- Mentor and level up other engineers through pairing, reviews, technical guidance, and best-practice evangelism.
- Lead technical discussions, drive alignment, and make pragmatic decisions with clear trade-offs.
The “extra mile” mindset we value
We value engineers who don’t stop at “it works.” You’ll thrive here if you naturally:
- Stay with hard problems until the real root cause is found (not just symptoms).
- Use a “detective” approach: form hypotheses, validate with evidence, and iterate quickly.
- Go beyond your immediate area to unblock solutions, including:
  - reading internal tooling or framework code when needed (and occasionally digging into upstream/open-source source code to understand behavior)
  - collaborating across teams to trace system boundaries and ownership
  - building reproducible test cases, simulations, or load tests to validate fixes and performance changes
  - creating small tools/scripts to diagnose production issues or prevent regressions
What we’re looking for (must-have)
1) Technical competencies
- 3+ years of experience in data engineering (or equivalent experience building production-grade data systems).
- Strong coding ability in Python, plus experience in Scala and/or Go (or strong ability to ramp quickly).
- Strong grasp of system design, design patterns, and engineering trade-offs.
- Experience designing robust pipelines end-to-end (batch + ideally streaming).
- Solid SQL skills and strong understanding of data modeling:
  - OLTP vs OLAP, star schema, partitioning strategies, and how modeling impacts performance and usability
- Hands-on experience with distributed processing and big data systems (e.g., Spark, EMR, data lake architectures).
- Strong operational mindset: observability, reliability, and performance optimization.
2) Behavioral & leadership competencies
- Demonstrated ability to lead technical discussions and drive decisions.
- Strong ownership: you take problems from unclear to solved, and you close loops.
- Comfortable mentoring junior engineers and raising team standards.
- Clear communication, especially around constraints, risks, and trade-offs.
- Strong documentation habits and a reliability-first mindset.
Nice-to-haves / advantages
- Active involvement in open source, technical blogging/writing, hackathons, or building meaningful side projects.
- Experience with streaming ecosystems (e.g., Kafka-style patterns), data contracts, schema evolution, or event-driven architectures.
- Experience implementing data quality frameworks (tests, anomaly detection, freshness checks).
- Cost optimization experience in cloud data platforms (compute/storage trade-offs).
What success looks like in the first 3–6 months
- You’ve improved the reliability and observability of key pipelines (clear metrics, alerts, fewer incidents).
- You’ve delivered at least one meaningful pipeline or architecture improvement that scales better and is easier to operate.
- You’ve led one or more root-cause deep dives and implemented fixes that prevent recurrence.
- You’ve strengthened team execution through mentoring, reviews, and better engineering practices.