AlayaCare

Director, Site Reliability Engineering

AlayaCare Montréal, Quebec, Canada 20 days ago
engineering

About the Role

The Director, SRE leads the strategy, maintenance, and continuous improvement of infrastructure, monitoring systems, and applications, while overseeing on-call incident response. This role drives disciplined project execution, optimizes costs through FinOps practices, and ensures that applications and architecture align with the long-term vision.

Working closely with Product, Engineering, and Customer Success, the Director, SRE supports smooth pre- and post-deployment operations, proactively addressing system challenges and optimizing performance. The role also leads incident response efforts, ensuring resilience and reliability.

As a people leader, the Director, SRE builds and develops a global SRE team, mentors managers, and fosters a strong team identity with clear goals and delivery plans.

What You'll Do

  • Vision & Strategy: Inspire and lead with a clear vision; define and communicate the SRE roadmap in alignment with business goals.
  • Governance & Alignment: Chair governance forums to align application architecture, workloads, observability, and technology stack toward shared standards.
  • Execution Discipline: Drive disciplined project execution with clear reporting and progress tracking on key initiatives.
  • Cost Optimization: Lead FinOps practices to continuously optimize infrastructure costs, improve gross margins, and embed financial accountability across all engineering teams.
  • Operational Excellence: Oversee monitoring of infrastructure health, performance, and utilization to meet current and future company needs.
  • Incident Response: Manage and evolve the 24/7 monitoring and on-call rotation with a follow-the-sun approach between Australia and North America to ensure effective incident detection and resolution.
  • Collaboration: Partner closely with Product, Engineering, Customer Success, Support, and Data teams to proactively address system challenges and improve resilience.
  • People Leadership: Provide direction that inspires and aligns with business goals, while cultivating trust, collaboration, and inclusion across the organization, and developing future leaders through mentorship and talent growth.
  • Standards & Innovation: Establish, roll out, and continuously improve technical and operational best practices, staying ahead of industry trends.

What You Bring

  • A degree in Computer Science, Software Engineering, Mathematics, or similar.
  • 4+ years of experience as a people leader within an Engineering team.
  • Strong interpersonal and communication skills with the ability to build relationships, influence outcomes, and drive alignment across functions and stakeholders at all levels of the organization.
  • Hands-on technical experience with cloud platforms (AWS), Linux, Kubernetes, microservices, and large-scale operational systems (billions of requests/month, 99.9%+ SLA).
  • Demonstrated success leading FinOps or cost optimization initiatives, and chairing architecture governance forums and processes to align technical and business priorities.
  • Strong program/project management discipline with a record of delivering complex engineering initiatives on time and at scale.
  • Familiarity with system and network security, data backup and recovery, business continuity, and disaster recovery practices.
  • Experience working with global cross-functional teams across time zones and cultures to achieve shared objectives.
  • Deep experience participating in or managing on-call rotations for monitoring, with proven incident response leadership.
  • Track record of recruiting, mentoring, and developing engineering leaders and fostering high-performing teams.
  • Excellent written and oral communication skills in English; bilingual French/English is an asset.
