Cloud TPU v6e / Trillium
Summary of the Google TPU v6e TensorCore Cloud TPU chip for training,
inference, and roofline-style performance analysis.
Vendor Google
Architecture TPU v6e TensorCore
Unit Cloud TPU chip
Form factor Cloud TPU slice chip
Launch 2024-12-11
Memory 32 GB HBM
HBM bandwidth 1.6 TB/s
BF16 peak 918 TFLOPS
FP16 peak n/a
FP8 dense peak n/a
FP8 sparse peak n/a
FP4 dense peak n/a
FP4 sparse peak n/a
FP64 peak n/a
INT8 peak 1.84 POPS
Interconnect ICI 2D torus, 800 GB/s bidirectional per chip
Power Not published per chip
Software stack JAX, XLA, TensorFlow, PyTorch/XLA
Notes
- Google positions v6e as the Trillium generation for transformer, text-to-image, CNN training, fine-tuning, and serving.
- A v6e pod has 256 chips and 234.9 PFLOPS of BF16 peak compute.
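The per-chip figures above are enough for a back-of-envelope roofline check. A minimal sketch (the numbers come from this page; the function names are illustrative, not from any TPU library):

```python
# Roofline arithmetic for TPU v6e, using the per-chip figures above.
# BF16 peak and HBM bandwidth are taken from this page; helper names
# are illustrative only.

BF16_PEAK = 918e12   # 918 TFLOPS BF16 per chip
HBM_BW = 1.6e12      # 1.6 TB/s HBM bandwidth per chip

def ridge_point(peak_flops: float, mem_bw_bytes_per_s: float) -> float:
    """Arithmetic intensity (FLOPs/byte) above which a kernel is compute-bound."""
    return peak_flops / mem_bw_bytes_per_s

def attainable(intensity: float) -> float:
    """Roofline: attainable FLOP/s for a kernel with the given intensity."""
    return min(BF16_PEAK, HBM_BW * intensity)

print(f"BF16 ridge point: {ridge_point(BF16_PEAK, HBM_BW):.1f} FLOPs/byte")
# A memory-bound op at 10 FLOPs/byte is capped by HBM bandwidth:
print(f"attainable at 10 FLOPs/byte: {attainable(10) / 1e12:.0f} TFLOPS")
# Pod-level BF16 peak: 256 chips x 918 TFLOPS, matching the ~235 PFLOPS note:
print(f"pod peak: {256 * BF16_PEAK / 1e15:.1f} PFLOPS")
```

Kernels with arithmetic intensity below the ridge point (about 574 FLOPs/byte here) are limited by the 1.6 TB/s HBM bandwidth rather than the 918 TFLOPS compute peak.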
Sources