About Us
Sieve is the only AI research lab exclusively focused on video data. We combine exabyte-scale video infrastructure, novel video understanding techniques, and dozens of data sources to develop datasets that push the frontier of video modeling. Video makes up 80% of internet traffic and has become the enabling digital medium powering creativity, communication, gaming, AR/VR, and robotics. Sieve exists to solve the biggest bottleneck in the growth of these applications: high-quality training data.
We've partnered with top AI labs and did $XXM last quarter alone, as a team of just 12 people. We also raised our Series A earlier this year from Tier 1 firms such as Matrix Partners, Swift Ventures, Y Combinator, and AI Grant.
About the Role
As a distributed systems engineer at Sieve, you’ll design and engineer systems that handle the compute, scheduling, and orchestration of complex ML + ETL pipelines that need to run quickly, reliably, and cost-effectively on large volumes of video.
You’re likely a good fit if you have worked with cloud technologies and love optimizing for system uptime, tuning hyper-fast distributed systems at the scale of thousands of GPUs, and building great internal tooling and CI/CD for rapid iteration.
Requirements
3+ years of experience building foundational data infrastructure
Proficient in working across diverse cloud architectures
Designed and maintained pipelines that process petabytes of data
Developed robust CI/CD pipelines tailored for ML-focused teams
Strong coding experience with Go and Python; experience with Rust is a plus
Operates as an IC who leads by example
Experience with large-scale video data systems
In-person at our SF HQ