Description

In this episode of Inference Time Tactics, Rob, Cooper, and Byron sit down with Prashanth Velidandi, co-founder of InferX, to explore how serverless inference is tackling the AI “cold start problem.” They dig into why 90% of the model lifecycle happens at inference—not training—and how cold starts and idle GPUs are crippling efficiency. Prashanth explains InferX’s snapshot technology, what it takes to deliver sub-second cold starts, and why inference infrastructure—not just models—will define the next era of AI.

We talked about:

 

Connect with InferX:

Prashanth Velidandi

https://inferx.net 

https://x.com/pmv_inferx 

https://www.linkedin.com/in/prashanth-velidandi-98629b115

Connect with Neurometric:

Website: https://www.neurometric.ai/

Substack: https://neurometric.substack.com/ 

X: https://x.com/neurometric/ 

Bluesky: https://bsky.app/profile/neurometric.bsky.social

 

Rob May

https://x.com/robmay 

https://www.linkedin.com/in/robmay

 

Calvin Cooper

https://x.com/cooper_nyc_ 

https://www.linkedin.com/in/coopernyc

 

Byron Galbraith

https://x.com/bgalbraith 

https://www.linkedin.com/in/byrongalbraith