As AI workloads become more distributed and latency-sensitive, deciding where processing occurs (on devices, at the edge, or in data centers) is becoming a key architectural challenge. This session explores how infrastructure owners, telcos, and cloud providers are allocating AI compute to balance performance, cost, and energy use. Panelists will discuss inference strategies, model partitioning, and orchestration, as well as how these trends are shaping investment in edge and distributed AI infrastructure.