Senior/Staff AI Engineer
Job Description:
- Build and optimize LLM serving and inference systems for production environments
- Improve performance across GPU and CPU pathways
- Work on KV cache, memory, storage, and throughput bottlenecks
- Design and scale systems that support RAG and retrieval-heavy AI workloads
- Contribute to infrastructure where storage architecture and systems efficiency materially affect AI performance
- Solve engineering problems at the intersection of AI, high-performance systems, and distributed infrastructure
Requirements:
- An engineer who has spent meaningful time building or optimizing production AI systems, not just experimenting with models
- Someone who understands how inference performance is shaped by the interaction between compute, memory, storage, and serving architecture
- Deep hands-on experience working close to the systems layer, for example improving how workloads run across GPU and CPU resources, reducing bottlenecks, or tuning infrastructure for better throughput and latency
- Evidence of real ownership in areas like model serving, retrieval, caching, storage, or distributed performance, rather than purely application-layer AI work
- The ability to move comfortably between architecture decisions and hands-on implementation, especially in environments where efficiency and scale matter
- A background that suggests you can operate in technically demanding environments, whether that comes from AI infrastructure, high-performance systems, storage platforms, or adjacent distributed systems work
- PhD preferred, but far less important than having built serious systems in the real world