Gimlet Labs is building the first heterogeneous neocloud for AI workloads. As AI systems scale, the industry is hitting fundamental limits in power, capacity, and cost with today’s homogeneous, vertically integrated infrastructure. Gimlet addresses this by decoupling AI workloads from the underlying hardware. Our platform intelligently partitions workloads into components and schedules each component onto the hardware that best fits its performance and efficiency needs. This approach enables heterogeneous systems across multi-vendor and multi-generation hardware, including the latest emerging accelerators. These systems unlock step-function improvements in performance and cost efficiency at scale.
On top of this foundation, Gimlet is building a production-grade neocloud for agentic workloads. Customers use Gimlet to deploy and manage their workloads through stable, production-ready APIs, without having to reason about hardware selection, placement, or low-level performance optimization.
Gimlet works with foundation labs, hyperscalers, and AI-native companies to power real production workloads built to scale to gigawatt-class AI datacenters.
Gimlet Labs is seeking a Member of Technical Staff focused on AI research. As an AI Researcher, you will evaluate and implement techniques to drive performance and quality optimizations across the latest AI models. The research team is responsible for exploring new model architectures and experimenting with novel inference efficiency techniques such as KV caching and FlashAttention. The team will design and prototype frameworks leveraging fine-tuning and knowledge distillation to push the boundaries of model performance.
Responsibilities:
Monitoring and evaluating cutting-edge AI research
Researching ways to improve model accuracy, performance and efficiency
Prototyping frameworks with the latest fine-tuning and distillation techniques
Qualifications:
Master’s or PhD degree in computer science, engineering, applied mathematics, or a comparable area of study
Experience with AI/ML or applied data science
Preferred Qualifications:
Experience with PyTorch, TensorFlow, vLLM, ONNX and other AI frameworks
Software development experience with Python and C++
Understanding of the latest AI research and techniques
Strong foundation in statistical analysis