Our partner is seeking a Service Delivery Manager in Lubbock, TX (on-site at TTU / regional). This full-time role within the Product department focuses on AI/HPC Infrastructure and Amalgamy.ai orchestration.
The Service Delivery Manager is responsible for ensuring the successful delivery, optimization, and long-term adoption of Amalgamy.ai-powered HPC environments in an academic and public-sector setting. This role serves as the primary connection between ThisWay Global and researchers, AI developers, and state stakeholders supporting accelerated compute initiatives. Success means stable deployments, optimized workflows, strong stakeholder collaboration, and a feedback loop that informs the product roadmap. Together, these outcomes support a long-term public-private partnership delivering AI infrastructure for the State of Texas.
Responsibilities
- Serve as the primary technical point of contact for researchers and state stakeholders
- Ensure environments are delivered, operational, and aligned with researcher needs
- Maintain continuity across evolving compute initiatives and multi-year engagements
- Coordinate internal technical resources to support delivery and execution
- Oversee end-to-end delivery across engineering and infrastructure teams
- Identify risks early and drive resolution across teams
- Facilitate workshops and design sessions to support effective use of orchestration and workflow tools
- Translate research challenges into actionable product improvements
- Provide structured feedback to inform the Amalgamy.ai roadmap
- Identify workflow bottlenecks across compute, data movement, and architecture
- Guide tuning and optimization efforts to ensure environments deliver measurable value
- Ensure systems evolve as workloads scale or diversify
Requirements
- Hands-on familiarity with Linux environments, cluster architectures, and GPU-accelerated systems
- Understanding of how high-performance compute environments are designed, deployed, and optimized
- Ability to engage with AI researchers working on distributed training, simulation, LLMs, or digital twin workloads
- Knowledge of networking, storage, and orchestration layers
- Ability to translate technical challenges into clear requirements
- Strong written and verbal communication skills across technical and executive stakeholders
- Ability to work on-site in Lubbock, TX as required
Infrastructure & Architecture:
- Familiarity with data center fundamentals (power distribution, thermal loads, rack optimization)
- Understanding of high-speed interconnects (InfiniBand, advanced Ethernet topologies)
- Experience aligning hardware configurations to workload requirements
Systems & Automation:
- Experience administering Linux systems and monitoring performance
- Proficiency in Python or Bash for automation or troubleshooting
- Experience with orchestration and workload managers (e.g., Slurm, Kubernetes)
Performance & AI Workloads:
- Ability to identify bottlenecks across code, storage, networking, and hardware layers
- Exposure to distributed training, LLMs, neuromorphic AI, or simulation-focused environments