What is the salary for this MLOps Support Engineer role?

Salary information is not publicly listed for this position. Apply directly to discuss compensation with CloudFactory.

Where is this MLOps Support Engineer position located?

This is an on-site position at CloudFactory located in Kathmandu, Bagmati Province, Nepal, Asia.

MLOps Support Engineer

CloudFactory | Kathmandu, Bagmati Province, Nepal | Posted today

engineering

About the role:

The MLOps Support Engineer is an operations-first role, focused on ensuring AI/ML systems remain stable, observable, and supportable in production environments. This is not a data science or feature development role.

The primary objective is to maintain continuous performance of ML models and associated pipelines with minimal disruption to both internal and client-facing services. You will provide Tier 1 and Tier 2 support, escalating to Tier 3 Engineering as needed.

What you’ll do:

Provide Tier 1 / Tier 2 operational support for AI/ML solutions.
Identify failed jobs, degraded pipelines, or performance anomalies.
Triage incidents, investigate issues, and coordinate escalation to Tier 3 Engineering.
Participate in on-call rotas once established.
Validate that pipelines and jobs complete successfully.
Monitor data pipeline health, model execution, and basic performance metrics.
Identify operational issues before they impact customers.
Respond to or alert customers when there has been an outage or issue with one of their models.
Support incident management, rollback, and recovery activities.
Use and maintain runbooks and operational documentation.
Work with Engineering to improve supportability and observability.
Contribute to knowledge sharing to reduce single points of failure.
Work within defined SLAs and support processes as the service matures.
Build quarterly business reviews to provide updates on the health of the ML models.
Evaluate champion/challenger models to see if a new model should be promoted.
Monitor for model drift and performance degradation, while validating that updates (new champion models or added data) do not introduce bias.

Requirements

Essential

Experience in operations, DevOps, SRE, or platform support roles.
Strong troubleshooting skills in production environments.
Proficiency in SQL and scripting (Python, Bash) for developing and automating ML workflows.
Familiarity with cloud platforms (AWS, GCP, Azure) and their ML services.
Git: Solid understanding of version control, particularly in collaborative development environments.
Comfortable working from runbooks and structured processes.

Desirable

Exposure to AI/ML systems in production.
Familiarity with monitoring and observability tools (Grafana, Power BI, New Relic).
Knowledge of MLOps tooling and data platforms (MLflow, Databricks).
Experience supporting customer-facing platforms.
Knowledge of containerization and orchestration (Kubernetes) is a plus.
Experience with LLM prompt engineering and troubleshooting.
Eagerness to learn about complex predictive models.
Background in computer science, informatics, or related fields.
Passion for Machine Learning and AI: An eager learner who is excited about working with cutting-edge ML technologies and is passionate about optimizing and maintaining ML models in production environments.
Early Career in MLOps or ML Engineering: Ideally a junior ML engineer with a strong desire to grow in the field of MLOps and AI operations.
A Collaborative Mindset: You thrive in a team setting and are ready to contribute to model improvement, A/B testing, and iterative development.
Attention to Detail: A focus on model performance, bias prevention, and ensuring optimal model behavior as new data and models are introduced.

Additional information:

Nepal

This role provides MLOps coverage from 07:45 – 15:45* NPT for US-based customers. You will be required to work during these hours and potentially outside of them if a model has issues.
Rotational On-Call work will also be required.

Colombia

This role provides MLOps coverage from 11am to 9pm* Colombia time for a US-based customer. You will be required to work on a shift rota covering 8-hour blocks within this period, and potentially outside of it if a model has issues.
Rotational On-Call work will also be required.

*Note that these hours are subject to change upon review.
