As AI models grow in complexity and scale, inference efficiency has emerged as a critical engineering challenge for enterprise deployment. Traditional infrastructure built for training workloads often fails to meet the latency, throughput, and cost demands of large-scale inference operations. In this session, Sandeep shares practical insights from engineering AI infrastructure at Broadcom, focusing on the end-to-end optimization of compute, networking, and storage subsystems. The talk explores techniques such as dynamic workload placement, adaptive batching, model quantization,...
Sandeep Kaipu
Sandeep Kaipu is an Engineering Manager at Broadcom with 20 years of experience in software engineering leadership across AI infrastructure, cloud platforms, and enterprise systems. At Broadcom, he leads teams focused on building scalable AI and cloud infrastructure for enterprise deployments. Previously at VMware, he directed multiple R&D initiatives within the VMware Cloud Foundation and Workspace ONE platforms. His career spans key engineering roles at Nokia, Samsung, and IBM, where he contributed to large-scale distributed systems and enterprise-grade software solutions. Sandeep holds a Master’s in Software Systems from BITS Pilani. He is the author of the book AI Engineering Leadership and is a recognized speaker on AI infrastructure and engineering leadership topics.