Trainium

Summary of the AWS Trainium cloud accelerator chip (NeuronCore-v2 architecture) for training, inference, and roofline-style performance analysis.

Vendor
AWS
Architecture
NeuronCore-v2
Unit
Cloud accelerator chip
Form factor
Trn1 instance chip
Launch
2022-11-28
Memory
32 GiB HBM
HBM bandwidth
0.82 TB/s
BF16 peak
190 TFLOPS
FP16 peak
190 TFLOPS
FP8 dense peak
n/a
FP8 sparse peak
n/a
FP4 dense peak
n/a
FP4 sparse peak
n/a
FP64 peak
n/a
INT8 peak
380 TOPS
Interconnect
NeuronLink-v2 (384 GB/s inter-chip)
Power
Not published per chip
Software stack
AWS Neuron SDK
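
The spec figures above are enough to sketch a roofline model for this chip. As a minimal sketch (using only the published BF16 peak and HBM bandwidth; real kernels reach some fraction of each):

```python
# Roofline ridge point for Trainium (NeuronCore-v2), from the
# spec-sheet figures above: 190 TFLOPS BF16 peak, 0.82 TB/s HBM.
BF16_PEAK_FLOPS = 190e12   # FLOP/s
HBM_BW_BPS = 0.82e12       # bytes/s

# Arithmetic intensity (FLOPs per byte moved) at which a kernel
# crosses from memory-bound to compute-bound.
ridge = BF16_PEAK_FLOPS / HBM_BW_BPS
print(f"ridge point: {ridge:.0f} FLOPs/byte")  # ~232

def attainable_tflops(intensity_flops_per_byte):
    """Roofline: attainable throughput at a given arithmetic intensity."""
    return min(BF16_PEAK_FLOPS, intensity_flops_per_byte * HBM_BW_BPS) / 1e12

print(f"at  10 FLOPs/byte: {attainable_tflops(10):.1f} TFLOPS")   # memory-bound
print(f"at 500 FLOPs/byte: {attainable_tflops(500):.1f} TFLOPS")  # compute-bound
```

Kernels with intensity well below the ridge point (roughly 232 FLOPs/byte here) are limited by the 0.82 TB/s HBM bandwidth rather than the compute peak.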

Notes

  • Trainium is exposed to users through EC2 Trn1 instances; a full-size instance carries 16 chips.
  • AWS quotes FP16, BF16, cFP8, and TF32 peak throughput as a single figure for Trainium.
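
Since per-chip figures are what the sheet lists, instance-level totals follow by multiplication. A small sketch, assuming the 16-chips-per-full-instance figure from the notes above:

```python
# Per-instance aggregates for a full Trn1 instance, assuming
# 16 Trainium chips per instance (see notes above).
CHIPS = 16

bf16_tflops = CHIPS * 190    # per-chip BF16 peak, TFLOPS
hbm_gib     = CHIPS * 32     # per-chip HBM, GiB
hbm_tbps    = CHIPS * 0.82   # per-chip HBM bandwidth, TB/s
int8_tops   = CHIPS * 380    # per-chip INT8 peak, TOPS

print(f"BF16 peak: {bf16_tflops} TFLOPS")  # 3040
print(f"HBM:       {hbm_gib} GiB")         # 512
print(f"HBM BW:    {hbm_tbps:.2f} TB/s")   # 13.12
print(f"INT8 peak: {int8_tops} TOPS")      # 6080
```

These are aggregate peaks; inter-chip traffic over NeuronLink-v2 (384 GB/s) determines how closely a multi-chip workload can approach them.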

Sources