Plivo is a leading technology company transforming customer engagement for some of the world’s largest B2C brands, including Uber, WhatsApp, and Zomato. Our new product, the AI agents platform, automates the entire customer lifecycle, from acquiring to engaging and supporting customers, through cutting-edge multimodal AI, including LLMs, text-to-speech, and speech detection. With a 100+ member team based in India and the US, we are building high-impact global products that handle over 1 billion API requests per month. If you are excited about solving hard, real-world AI challenges at scale, this is where you belong.
Role Overview
This is a deep systems and multidisciplinary role that bridges real-time communications (RTC), VoIP infrastructure, backend systems, and AI model development. You’ll architect and build a distributed RTC platform across the globe, develop backend services, and integrate AI models into production-grade voice and multimodal experiences at scale. If you love low-latency systems, real-time voice engineering, and AI-driven innovation, this is where you belong.
What You’ll Do
- Design and build real-time voice systems using WebRTC, SIP/RTP, and WebSocket streaming.
- Engineer backend infrastructure for signaling, routing, call control, and audio/video processing.
- Work with open-source RTC stacks -- FreeSWITCH, Kamailio, LiveKit, RTPEngine, and Pipecat.
- Develop and integrate AI capabilities, including TTS (Text-to-Speech), STT (Speech-to-Text), VAD (Voice Activity Detection), media servers, and AI voice agents.
- Build and scale a global, distributed RTC platform with strong resilience, observability, and low latency.
- Integrate AI/ML models into real-time voice systems (speech recognition, synthesis, embeddings).
- Work across the stack -- from C/Go/Rust real-time components to Python/Node.js backend services and our SDKs.
- Collaborate cross-functionally with Product and DevOps teams.
- Instrument and monitor systems for quality, latency, and performance.
- Prototype rapidly: build, test, iterate, and deploy new RTC + AI features.
- Contribute to open-source voice and AI ecosystems.
- Be hands-on: debug issues, tune queries, optimize performance, and improve resiliency. You own your code from dev to prod.
- Don’t be afraid to jump on a call or chat with a customer to ensure they have a smooth experience -- you own the outcome, not just the code.
- Use AI-assisted development tools to improve coding speed, testing, and code quality.
What You Bring
- Strong foundation in systems programming -- C, Go, and/or Rust.
- Experience in backend and real-time systems engineering.
- Expertise in WebRTC, SIP, VoIP, and signaling/audio pipelines.
- Hands-on experience with open-source RTC stacks: FreeSWITCH, Kamailio, LiveKit, RTPEngine, Pipecat.
- Understanding of media negotiation, codec pipelines, and audio/video streaming.
- Knowledge of real-time networking (UDP/TCP, ICE, NAT traversal).
- Experience building and scaling distributed RTC platforms.
- Experience with AI voice systems -- TTS, STT, VAD, LLM voice agents, or speech embeddings.
- Familiarity with AI/ML frameworks (PyTorch, TensorFlow, ONNX) or model integration.
- Backend development experience in Python or Node.js.
- Strong debugging, profiling, and performance optimization skills.
- Builder mindset: proactive, curious, and able to thrive in complex systems.
Bonus Points
- Contributions to open-source RTC or AI projects.
- Familiarity with LLM integration and multimodal AI (voice + text).
- Experience in edge computing or real-time streaming optimization.
- Exposure to audio signal processing or DSP algorithms.
- Experience deploying real-time systems on cloud (AWS/GCP) with Docker/Kubernetes.
- Experience with AI voice agents or voicebots.