Who we are
LTX is pioneering a new era of visual rendering tools that empower professionals and creators across creative industries such as Advertising, Creative Marketing, 3D Design, Animation, and more. Our latest release, LTX-2, unites high-quality synchronized audio and video generation, native 4K fidelity, long sequences, and radical efficiency, all in a single, production-ready system. At LTX, we’re building a new foundation that transforms the relationship between creation and technology.
What you will be doing
As an ML Software Engineer with a focus on low-level and CUDA-based optimizations, you will play a key role in shaping the design, performance, and scalability of Lightricks’ machine learning inference systems. You’ll work on deeply technical challenges at the intersection of GPU acceleration, systems architecture, and ML deployment.
Your expertise in CUDA, C/C++, and performance tuning will be crucial in enhancing runtime efficiency across heterogeneous computing environments. You’ll collaborate with designers, researchers, and backend engineers to build production-grade ML pipelines that are optimized for latency, throughput, and memory use, contributing directly to the infrastructure powering Lightricks’ next-generation AI products.
This role is ideal for an engineer with strong systems-level thinking, deep familiarity with GPU internals, and a passion for pushing the boundaries of performance and efficiency in machine learning infrastructure.
Responsibilities
- Design and implement highly optimized GPU-accelerated ML inference systems using CUDA and low-level parallelism techniques
- Optimize memory, compute, and data flow to meet real-time or high-throughput constraints
- Improve the performance, reliability, and observability of our inference backend across diverse compute targets (CPU/GPU)
- Collaborate with cross-functional teams (including researchers, developers, and designers) to deliver efficient and scalable inference solutions
- Contribute to ComfyUI and internal infrastructure to improve usability and performance of model execution flows
- Investigate performance bottlenecks at all levels of the stack, from Python down to kernel-level execution
- Navigate and enhance a large, complex, production-grade codebase
- Drive innovation in low-level system design to support future ML workloads
Your Skills and Experience
- 5+ years of experience in high-performance software engineering
- Advanced proficiency in CUDA, C/C++, and Python, especially in production environments
- Deep understanding of GPU architecture, memory hierarchies, and optimization techniques
- Proven track record of optimizing compute-intensive systems
- Strong system architecture fundamentals, especially around performance, concurrency, and parallelism
- Ability to independently lead deep technical investigations and deliver clean, maintainable solutions
- Collaborative and team-oriented mindset, with experience working across functional teams
Preferred Requirements
- Experience with low-level profiling and debugging tools (e.g., Nsight, perf, gdb, VTune)
- Familiarity with machine learning frameworks (e.g., PyTorch, TensorRT, ONNX Runtime)
- Contributions to performance-critical open-source or ML infrastructure projects
- Experience with cloud infrastructure and GPU scheduling at scale