Redpin

Senior Engineering Manager - SRE


Location: [Hyderabad/Mumbai]
Function: Site Reliability Engineering, Engineering Operations, Application Support
Reporting to: Senior Director

About the Company

At Redpin, we simplify life's most important payments. Buying a new property overseas can be a stressful time, especially when it comes to moving your money. Through our Currencies Direct and TorFX brands, we've been helping people do just that for over 25 years. With recent investment, we're now building a new range of digital products and services that will make moving money internationally for Real Estate purchases even easier.

We’re on a mission to become the solution for Real Estate payments everywhere. To do this, we are transitioning our business from a horizontal FX platform to a verticalized, embedded software company, as we look to the future and Redpin 2.0.

 

About the Role

We are looking for a highly experienced Senior Engineering Manager to lead our Site Reliability Engineering, Platform Operations, and Application Support functions. In this role, you will own the reliability, stability, and performance of our production and non-production environments, ensuring world-class operational excellence for internal teams, customers, and partners.

You will lead 24×7 global teams, drive SRE best practices, reduce operational toil through automation, and build a proactive, data-driven reliability culture across Product, Engineering, and Infrastructure teams.

This is a strategic and hands-on leadership role for someone who thrives in complex, fast-paced environments and is passionate about improving observability, resilience, and customer experience.

 

What you'll do 

Reliability, Stability & Operational Excellence

  • Own end-to-end reliability and stability of all production and non-production environments.
  • Establish and evolve SRE practices including SLIs/SLOs, error budgets, operational readiness, and reliability reviews.
  • Drive improvements in operational metrics such as MTTR, availability, performance, throughput, and deployment success rate.
  • Lead proactive observability initiatives (logs, metrics, traces, dashboards, anomaly detection).

Application Support & Incident Management

  • Manage frontline L2/L3 application support teams operating 24×7.
  • Oversee incident response, escalation, root cause analysis (RCA), and post-mortem processes.
  • Build and mature Problem Management to reduce recurring incidents.
  • Ensure effective runbooks, playbooks, and knowledge bases.

Platform, Cloud & Infrastructure

  • Own AWS cloud operations including scaling, cost governance, security posture, and environment readiness.
  • Work with DevOps/Platform Engineering teams to enhance CI/CD pipelines, deployment reliability, and environment parity.
  • Introduce automation to eliminate toil and increase team productivity.

Monitoring, Reporting & Insights

  • Build and maintain operational dashboards for availability, performance, and platform health.
  • Produce and present regular reports to leadership on reliability trends, risks, and improvements.
  • Implement proactive alerting strategies to detect issues early.

Strategic Initiatives & Continuous Improvement

  • Lead cross-functional initiatives to modernize infrastructure, improve resiliency, and scale systems sustainably.
  • Drive vendor assessments, tool evaluations, and engagement with external partners.
  • Support and contribute to security audits and compliance requirements (SOC 2, ISO 27001, GDPR).
  • Own and evolve the Business Continuity Plan (BCP) and disaster recovery strategy.

Cross-functional Leadership

  • Partner closely with Product, Engineering, QA, Security, and Customer Success teams to prioritize and deliver reliability-focused improvements.
  • Facilitate operational readiness for new releases, major product launches, and migrations.
  • Promote a culture of learning, transparency, and continuous improvement.

People Leadership

  • Manage and mentor a diverse team of SREs, platform engineers, and support analysts.
  • Build career development plans, coach team members, and foster a high-performance culture.
  • Recruit, grow, and retain top talent for SRE and Operations teams.
  • Lead with empathy, clarity, and strong communication.

 

What you'll need 

  • 12+ years of experience in SRE, DevOps, Production Engineering, Platform Operations, or Application Support.
  • Strong experience managing 24×7 support or operations teams.
  • Deep understanding of AWS cloud services (EC2, ECS/EKS, RDS, S3, IAM, Lambda, CloudWatch, etc.).
  • Proven ability to drive automation, build tooling, and reduce manual operational work.
  • Strong understanding of CI/CD pipelines, deployment processes, and environment management.
  • Experience with observability platforms (Datadog, Grafana, Prometheus, New Relic, ELK, Splunk, CloudWatch).
  • Solid understanding of incident management, RCA, post-mortems, and problem management frameworks.
  • Ability to lead technical and non-technical stakeholders toward alignment and shared outcomes.
  • Experience working in highly regulated environments with security, compliance, and audit requirements.
  • Excellent analytical, communication, and decision-making skills.

 

Bonus Points 

  • Exposure to microservices, Kubernetes, containers, and high-scale distributed systems.
  • Experience with automation frameworks, scripting (Python, Bash), or Infrastructure-as-Code (Terraform).
  • Understanding of FinOps / cloud cost optimization.
  • Familiarity with ITIL concepts.
  • Experience managing multiple vendors or outsourced teams.
  • Experience leading reliability initiatives across global organizations.
