Accelerator
Cloud TPU v5p
Summary of the Google TPU v5p TensorCore Cloud TPU chip for training,
inference, and roofline-style performance analysis.
Vendor Google
Architecture TPU v5p TensorCore
Unit Cloud TPU chip
Form factor Cloud TPU slice chip
Launch 2023-12-06
Memory 95 GB HBM2e
HBM bandwidth 2.765 TB/s
BF16 peak 459 TFLOPS
FP16 peak n/a
FP8 dense peak n/a
FP8 sparse peak n/a
FP4 dense peak n/a
FP4 sparse peak n/a
FP64 peak n/a
INT8 peak n/a
Interconnect ICI 3D torus - 1200 GB/s bidirectional per chip
Power Not published per chip
Software stack JAX, XLA, TensorFlow, PyTorch/XLA
Notes
- v5p is oriented toward large-scale training and uses a 3D torus topology for full-cube and larger slices.
- Google quotes a single BF16 peak compute figure per chip rather than publishing a broad per-precision table as GPU vendors do.
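The BF16 peak and HBM bandwidth above are enough for a simple roofline estimate. A minimal sketch, using only the per-chip figures from this page (the arithmetic intensities fed in at the end are illustrative assumptions, not measured workloads):

```python
# Roofline sketch for one TPU v5p chip, using the figures above.
PEAK_BF16_FLOPS = 459e12   # 459 TFLOPS BF16 peak per chip
HBM_BW_BYTES = 2.765e12    # 2.765 TB/s HBM bandwidth per chip

def attainable_flops(intensity_flops_per_byte: float) -> float:
    """Roofline: attainable throughput is the lesser of peak compute
    and HBM bandwidth times arithmetic intensity."""
    return min(PEAK_BF16_FLOPS, HBM_BW_BYTES * intensity_flops_per_byte)

# Ridge point: intensity where the chip shifts from memory-bound
# to compute-bound (about 166 FLOPs per byte for v5p).
ridge = PEAK_BF16_FLOPS / HBM_BW_BYTES

for ai in (8.0, 64.0, ridge, 512.0):  # illustrative intensities
    print(f"AI={ai:6.1f} FLOP/B -> {attainable_flops(ai) / 1e12:6.1f} TFLOPS")
```

Kernels below the ridge point (roughly 166 FLOPs/byte) are limited by the 2.765 TB/s HBM bandwidth; above it, by the 459 TFLOPS BF16 peak.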