Who we are
We're Redis. We built the product that runs the fast apps our world runs on. (If you checked the weather, used your credit card, or looked at your flight status online today, you’re welcome.) At Redis, you’ll work with the fastest, simplest technology in the business—whether you’re building it, telling its story, or selling it to our 10,000+ worldwide customers. We’re creating a faster world with simpler experiences. You in?
Why you'll love this job
As a Manager in Cloud Operations at Redis, you will lead a team of senior engineers responsible for keeping Redis Cloud reliable, available, and fast for customers worldwide. You’ll combine hands-on technical leadership with day-to-day people management, helping your team grow while maintaining strong operational standards.
You will work in a dynamic, fast-paced, large-scale, multi-cloud production environment, collaborating closely with R&D and other engineering teams, and continuously improving how we run and scale our distributed systems. You will also drive innovation in how the team operates by leading cross-functional projects that introduce practical new tools, automation, and ways of working into production. Your leadership will help align operations with business priorities, build a strong operations culture, and steadily improve our reliability and efficiency in a fast‑evolving, technical domain.
Our ideal candidate thrives on leading multiple, high‑visibility operational initiatives, communicates clearly with different stakeholders, and enjoys working in an environment that demands proactive risk management, cross‑team collaboration, and continuous improvement. If you are driven by ownership, enabling teams, and delivering reliable cloud services at scale, this is your opportunity.
What you’ll do
- Lead and develop a CloudOps team in Israel as part of the global Cloud Operations organization, setting clear goals, expectations, and ways of working, and investing in people's growth and performance.
- Act as a hands-on technical leader in a cloud‑native, high‑scale environment, with focus on reliability, resiliency, observability, automation, performance, and cost efficiency.
- Own and improve core operational processes: on‑call and incident response, escalations, change management, runbooks, production‑readiness, and post‑incident reviews that drive real follow‑through.
- Partner directly and proactively with R&D, Platform, Product, Support, and Customer Success to shape and reduce reliability risks, improve deployment safety and performance, and ensure customer‑impacting issues are tracked to closure without constant reminders.
- Use data, metrics, and observability tooling, together with automation and AI‑driven workflows, to measure system health, guide decisions, identify patterns, and drive continuous improvement in reliability and operational excellence.
What you’ll need to have
- 5+ years of experience in Cloud Operations, SRE, Production Engineering, or similar roles in large‑scale production environments, including 3+ years managing or leading engineering teams.
- Strong hands‑on experience with at least one major public cloud (AWS, GCP, or Azure) running production workloads at scale, balancing resilience, performance, and cost.
- Good understanding of Linux and networking fundamentals, hands-on experience with automation or scripting, exposure to modern observability and incident management tools and practices, and familiarity with databases or distributed data systems (experience with Redis or similar technologies is a strong plus).
- Proven ability to lead teams through incidents and operational change with clear, calm communication under pressure, and to collaborate effectively across time zones and multiple stakeholder groups.
- A high degree of ownership and accountability, with a data‑driven approach to prioritization and decision‑making, and the ability to balance process discipline with pragmatism and speed to delivery.
Extra great if you have
- Experience leading operational projects (e.g., automation, reliability improvements, cost optimization) and contributing from design to rollout.
- Experience with reliability metrics and incident response processes, including standardizing RCAs and driving long‑term fixes.
- Familiarity with ITIL/ITSM concepts (incident, change, problem management, and security/compliance processes) adapted for modern cloud operations.
- Experience with cost optimization, capacity planning, or FinOps‑related practices in large‑scale cloud environments.
- A track record of automation and process transformation in distributed teams, turning ad‑hoc workflows into scalable, repeatable, and well‑documented operational practices.
#LI-BL1
#LI-HYBRID