What is the salary for this Data Engineer role?

Salary information is not publicly listed for this position. Apply directly to discuss compensation with Mirelo.

Where is this Data Engineer position located?

This is a remote position at Mirelo.

How do I apply for this Data Engineer job at Mirelo?

Click the 'Apply' button on this page to be redirected to Mirelo's application portal. Make sure to have your resume ready and tailor your application to highlight relevant experience.

Data Engineer

Mirelo Remote Today

Mirelo AI is building the next generation of creative tools by generating realistic sound, speech and music from video.

We develop cutting-edge foundational generative AI models that "unmute" silent video content and create custom, hyper-realistic audio for gaming, video platforms, and creators. Our technology empowers global storytellers to transform their content.

We recently closed a $41 million Seed round co-led by Andreessen Horowitz and Index Ventures with participation from Atlantic, and are rapidly expanding across Product, Engineering, Go-to-Market, and Growth.

About the Role

At Mirelo, the quality of our models depends on the scale and depth of the data behind them. In this role, you’ll build and run the systems that power our entire training pipeline - from acquiring massive audio and multimodal datasets to shaping them into something our research team can actually train on. You’ll work across infrastructure, tooling, and annotation workflows, using a mix of automation, ML-based filtering, and hands-on evaluation to ensure our data is both clean and comprehensive. As part of the model development loop, you’ll help us understand what data we’re missing and move quickly to fill those gaps, making this role central to how our next-generation audio models evolve.

Key Responsibilities

Data acquisition

Develop and run scalable infrastructure for acquiring massive-scale audio (sound and music) and multimodal video-audio datasets
Coordinate data transfers from licensing partners and turn heterogeneous sources into training-ready datasets

Annotation and data quality

Obtain detailed annotations for audio and video data (descriptions, musical attributes, audio attributes, …)
Use state-of-the-art ML models for data cleaning, processing and filtering
Ensure data quality through automated tools and manual evaluation studies
Build scalable tools to analyze our datasets (compute statistics, create visualizations, …)

Efficient workflows and collaboration

Optimize and parallelize data processing workflows to handle massive-scale datasets efficiently across both CPUs and GPUs
Work directly in the model development loop, updating datasets as training trajectories reveal what we're missing

Ideal Candidate Profile

Strong proficiency in Python and experience working with different file systems for data-intensive processing and analysis
Hands-on familiarity with cloud platforms (AWS, GCP, or Azure) and Slurm/HPC environments for distributed data processing
Experience with audio and video processing libraries (ffmpeg, …) and an understanding of their performance characteristics
Demonstrated ability to optimize and parallelize data workflows across both CPUs and GPUs
Knowledge of machine learning techniques for data cleaning and preprocessing

Nice to Have

Have built or contributed to large-scale data acquisition systems and understand the operational challenges
Have implemented data processing and cleaning pipelines at scale
Are familiar with audio and video annotation processes for ML and have worked with the specifics of audio data
Have been part of shipping a state-of-the-art model and understand how data decisions impact training outcomes

Why Join?

Join at a pivotal moment. We've secured fresh funding and are gaining traction - now is when your contributions can make a real difference to our success.
True ownership from day one. You'll have genuine autonomy and responsibility. Your ideas and work will directly shape our product and company direction.
Competitive compensation and equity. We offer strong packages that ensure you share in the success you help create.
Build for the next generation of creators. Be part of the innovation that will transform how creators work and thrive.

We welcome applications from all individuals, regardless of ethnic origin, gender, disability, religion or belief, age, or sexual orientation and identity.