This role is responsible for designing, implementing, managing, and optimizing the infrastructure, automation pipelines, and workflows that support the entire lifecycle of software development, data processing, analytics, and machine learning model deployment. This individual will be a key technical expert ensuring the reliability, scalability, efficiency, and speed of our development, data, analytics, and ML operations, fostering collaboration between teams and promoting best practices across the DevOps, DataOps, and MLOps domains.
Here's what you'll be doing:
- Design, build, and maintain robust CI/CD pipelines for software applications, data transformations (ETL/ELT), and machine learning models (training, validation, deployment).
- Implement and manage Infrastructure as Code (IaC) using tools like Terraform to ensure reproducible and scalable environments (cloud or on-premise).
- Develop and automate data quality checks, data pipeline monitoring, and alerting systems within the DataOps framework.
- Establish and manage MLOps workflows including experiment tracking, model versioning, automated model retraining, and performance monitoring (drift, bias detection).
- Implement comprehensive monitoring, logging, and alerting solutions across all systems and pipelines (applications, data flows, ML models).
- Collaborate closely with software developers, data engineers, data scientists, and analysts to understand their needs and provide operational support and tooling.
- Champion and enforce best practices in security, reliability, and performance across all operational domains.
- Troubleshoot and resolve complex infrastructure, pipeline, and deployment issues.
- Evaluate and recommend new tools and technologies to improve operational efficiency and capabilities.
These objectives are not exhaustive and will evolve according to identified needs and current projects.
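To give a flavor of the automated data quality checks mentioned above, here is a minimal sketch in Python. The function name and rules are illustrative only, not tied to any specific framework the team uses; in practice such checks would run inside the orchestrated pipeline and feed the alerting system.

```python
def check_required_fields(rows, required_fields):
    """Return the indices of rows missing (or null in) any required field.

    A non-empty result would typically trigger a pipeline alert
    rather than silently passing bad records downstream.
    """
    failing = []
    for i, row in enumerate(rows):
        if any(row.get(field) is None for field in required_fields):
            failing.append(i)
    return failing


# Example: the second record is missing a non-null "amount".
records = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
]
bad_rows = check_required_fields(records, ["id", "amount"])
```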
Qualifications
- Deep expertise in DevOps, DataOps, and MLOps principles, practices, and tooling.
- Mastery of CI/CD pipeline design and implementation for diverse artifacts (code, data, models).
- Strong proficiency in cloud infrastructure management and automation (Azure).
- Expertise in containerization and orchestration (Docker, Kubernetes).
- Strong scripting and automation skills (Python, Bash, etc.).
- Experience with monitoring, logging, and observability tools (e.g., Prometheus, Grafana, ELK Stack, Datadog).
- Understanding of data engineering concepts and data pipeline orchestration tools.
- Familiarity with ML model lifecycle management and associated tooling.
- Bachelor’s or master’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in technical operations roles, with deep expertise in DevOps and demonstrable, hands-on experience implementing DataOps and MLOps practices.
- Proven experience with cloud platforms (mainly Azure, AWS is a plus), containerization (Docker, Kubernetes), CI/CD tools (e.g., Jenkins, GitLab CI, ArgoCD), IaC tools (e.g., Terraform), scripting (Python, Bash), data pipeline orchestration (e.g., Airflow), and ML platforms (e.g., MLflow, Kubeflow, SageMaker/Vertex AI).
- Fluent in English or French.
Additional Information
- Access to modern tools and technologies.
- A chance to make a meaningful impact on enterprise product quality.
- Opportunities for career growth and development.
- Soft-skills and technical training: workshops, conferences, and attendance at similar events.