Weekday AI

Technical Lead

Weekday AI · Bengaluru, Karnataka, India

This role is for one of Weekday's clients.

Min Experience: 3 years

Location: Bengaluru

Job Type: Full-time

We are looking for a highly driven Technical Lead to work across a multi-product SaaS platform, owning system reliability, scalability, and technical execution. This is a horizontal leadership role spanning multiple products and core systems, ensuring platforms remain fast, secure, and resilient under scale and peak traffic conditions.

This is a hands-on technical leadership role focused on architecture, reliability, and execution, not people management.

Key Responsibilities

1. System Reliability & Performance (Primary Ownership)

  • Own and improve reliability metrics across products, including uptime, SLA compliance, and P95 latency.
  • Monitor and reduce application errors, bug leakage, and system failures.
  • Ensure correctness of distributed systems involving synchronous and asynchronous workflows.
  • Optimize queue processing, worker throughput, and caching layers (e.g., Redis).
  • Prepare systems for high-traffic events and peak load scenarios.
  • Lead root cause analysis and drive permanent, systemic fixes.
  • Act as the technical owner for incident resolution and long-term prevention.

2. Architecture & Scalability

  • Collaborate with senior technical stakeholders to evolve platform architecture.
  • Improve API design, data models, and system boundaries.
  • Design scalable distributed system patterns such as idempotent workflows, retries, batching, and fan-out orchestration.
  • Build and scale asynchronous pipelines for high-volume workloads.
  • Plan capacity for traffic spikes and introduce resilience patterns like circuit breakers and fail-safes.

3. Hands-On Engineering Leadership

  • Lead and review technical designs across teams and products.
  • Unblock engineers on complex architectural or performance challenges.
  • Own and drive cross-product refactors and technical debt reduction.
  • Enforce clean code standards, testing practices, and observability-first development.
  • Mentor engineers on debugging, system design, and performance optimization.

4. Observability & Monitoring

  • Define and maintain SLIs and SLOs across critical systems.
  • Build dashboards, alerts, and monitoring using logs, metrics, and traces.
  • Ensure issues are detected proactively, before they impact users.
  • Work closely with platform teams to instrument distributed workflows end-to-end.

5. Security & Compliance

  • Ensure secure coding practices and adherence to compliance requirements (e.g., SOC 2).
  • Enforce proper secrets management, access controls, and audit logging.
  • Maintain data integrity, API security, and permission correctness across systems.

6. Cross-Functional Collaboration

  • Partner with Product teams to translate requirements into technically sound solutions.
  • Work with Support and Customer Success teams to deeply understand production issues.
  • Collaborate with Core Systems and Infrastructure teams to improve platform stability.
  • Align with QA teams to define testing strategies, including load, integration, and failure testing.

Requirements

Must Have

  • 3–4+ years of backend engineering experience (Python preferred).
  • Strong understanding of distributed systems and backend architecture.
  • Deep experience with SQL databases, data modeling, and query optimization.
  • Hands-on expertise with Redis, queues, async jobs, retries, and background processing.
  • Strong debugging skills across application and infrastructure layers.
  • Proven ability to lead technical decisions across multiple teams.
  • Experience improving system reliability and performance at scale.
  • Excellent communication and collaboration skills.

Nice to Have

  • Experience with observability tools such as Datadog, Sentry, or Elasticsearch.
  • Exposure to CRM integrations or large enterprise systems.
  • Prior ownership of reliability for multi-product SaaS platforms.
  • Familiarity with secure coding practices and compliance frameworks.

What Success Looks Like

0–3 Months

  • Gain a deep understanding of platform architecture and core systems.
  • Deliver quick reliability and performance improvements.
  • Become a go-to technical problem solver across teams.

4–6 Months

  • Establish clear SLIs and SLOs for key systems.
  • Introduce architectural guardrails and reduce operational noise.
  • Significantly lower error rates and production issues.

7–12 Months

  • Achieve high availability (99.9%+) across core platforms.
  • Ensure predictable and resilient async pipelines.
  • Improve performance under peak traffic conditions.
  • Enable faster engineering velocity through cleaner, more stable systems.

Skills

  • Backend Engineering
  • Distributed Systems
  • System Reliability
  • Relational Databases
  • Platform Scalability
