Google TPU 8
Google Cloud’s eighth-generation TPU family, announced at Next ’26, is split into the training-optimized TPU 8t and the inference-optimized TPU 8i.
Key facts
- Type: AI accelerator family
- TPU 8t: Optimized for training; uses Inter-Chip Interconnect to scale up to 9,600 TPUs and 2 PB of shared high-bandwidth memory in one superpod [src-044]
- TPU 8i: Optimized for inference; directly connects 1,152 TPUs in one pod and adds 3x more on-chip SRAM to keep larger KV caches on-chip [src-044]
- Performance claims: TPU 8t delivers 3x the processing power of the prior-generation Ironwood and up to 2x the performance per watt; TPU 8i delivers 80% better performance per dollar for inference than the prior generation [src-044]
- Agentic relevance: Google frames TPU 8i as enabling millions of concurrent agents cost-effectively [src-044]
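The superpod figures above imply a rough per-chip memory budget. As a back-of-the-envelope check (my own derivation, not a published per-chip spec, and assuming the 2 PB of shared HBM is split evenly across all 9,600 chips with decimal units):

```python
# Hypothetical per-chip HBM estimate for a TPU 8t superpod.
# Assumes the quoted 2 PB is total pod HBM divided evenly across chips;
# Google has not published a per-chip figure in the cited source.
chips_per_superpod = 9_600
total_hbm_gb = 2 * 1_000_000  # 2 PB in decimal gigabytes

hbm_per_chip_gb = total_hbm_gb / chips_per_superpod
print(f"~{hbm_per_chip_gb:.0f} GB HBM per chip")  # → ~208 GB
```

This lands at roughly 208 GB per chip, in the same ballpark as recent high-end accelerators, which suggests the pod-level figures are internally consistent.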
Source references
- [src-044] Thomas Kurian — “Welcome to Google Cloud Next ’26” (2026-04-22)