Cloud TPU v6e / Trillium

Summary of the Google TPU v6e TensorCore Cloud TPU chip for training, inference, and roofline-style performance analysis.

Vendor
Google
Architecture
TPU v6e TensorCore
Unit
Cloud TPU chip
Form factor
Cloud TPU slice chip
Launch
2024-12-11
Memory
32 GB HBM
HBM bandwidth
1.6 TB/s
BF16 peak
918 TFLOPS
FP16 peak
n/a
FP8 dense peak
n/a
FP8 sparse peak
n/a
FP4 dense peak
n/a
FP4 sparse peak
n/a
FP64 peak
n/a
INT8 peak
1.84 POPS
Interconnect
ICI 2D torus - 800 GB/s bidirectional per chip
Power
Not published per chip
Software stack
JAX, XLA, TensorFlow, PyTorch/XLA
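
Since the page is meant for roofline-style analysis, the two headline figures above (918 TFLOPS BF16 peak, 1.6 TB/s HBM bandwidth) determine the chip's ridge point. A minimal sketch; the helper name and example intensities are illustrative, not from any Google API:

```python
# Roofline sketch for TPU v6e using the spec-sheet figures above.
PEAK_BF16_FLOPS = 918e12  # BF16 peak, FLOP/s
HBM_BW = 1.6e12           # HBM bandwidth, bytes/s

def attainable_flops(arithmetic_intensity):
    """Roofline model: attainable FLOP/s is the lesser of the
    compute roof and the memory roof at a given FLOP/byte."""
    return min(PEAK_BF16_FLOPS, HBM_BW * arithmetic_intensity)

# Ridge point: the arithmetic intensity at which a kernel
# transitions from memory-bound to compute-bound.
ridge = PEAK_BF16_FLOPS / HBM_BW
print(f"ridge point: {ridge:.1f} FLOP/byte")  # ~573.8 FLOP/byte
```

Kernels below roughly 574 FLOP/byte are HBM-bandwidth-bound on this chip; dense matmuls at large batch sizes typically sit above it.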

Notes

  • Google positions v6e, the Trillium generation, for training, fine-tuning, and serving transformer, text-to-image, and CNN models.
  • A v6e pod has 256 chips and 234.9 PFLOPS of BF16 peak compute.