In this role, you will design, implement, and maintain scalable, secure, and reliable MLOps infrastructure and CI/CD pipelines that enable rapid, high-quality delivery of machine learning models and data-driven services to production. The role bridges ML/Development and Operations, driving automation, reliability, monitoring, and operational excellence across environments.
Key Responsibilities
- Build and operate end-to-end pipelines for training, validation, packaging, and deployment across dev/test/prod.
- Implement CI/CD for code, data, and model artifacts with quality gates, approvals, and rollbacks.
- Deploy and scale ML services using Docker and Kubernetes (real-time and batch), with safe rollout strategies.
- Set up model registry & experiment tracking and enforce reproducible, versioned releases (e.g., MLflow or equivalent).
- Implement monitoring and alerting for service health, latency, errors, and resource usage, as well as ML signals (drift, data quality, model performance).
- Define operational standards (SLIs/SLOs, incident response, RCA, runbooks) and continuously improve reliability.
- Enforce security best practices (IAM/RBAC, secrets management, network controls, audit logging) and collaborate with DS/ML/Data teams.
Requirements
- 3–7 years in MLOps/DevOps/Platform roles with production ML exposure.
- Strong CI/CD and automation skills, solid Python and Linux, and strong troubleshooting ability.
- Hands-on experience with Docker and Kubernetes and with observability tools (Prometheus/Grafana, ELK, OpenTelemetry, or similar).
About Master-Works
Master Works is a leading provider of tailored Information Technology (IT) solutions. Our expert team specializes in software development, cybersecurity, cloud computing, data analytics, and IT consulting.