Accelerator

B200 SXM

NVIDIA Blackwell SXM GPU summary for training, inference, and roofline-style performance analysis.


Vendor
NVIDIA
Architecture
Blackwell
Unit
SXM GPU
Form factor
SXM
Launch
2024-03-18
Memory
180 GB HBM3E
HBM bandwidth
8 TB/s
BF16 peak
2.25 PFLOPS
FP16 peak
2.25 PFLOPS
FP8 dense peak
4.5 PFLOPS
FP8 sparse peak
9 PFLOPS
FP4 dense peak
9 PFLOPS
FP4 sparse peak
18 PFLOPS
FP64 peak
40 TFLOPS
INT8 peak
n/a
Interconnect
NVIDIA NVLink 5 / NVSwitch, 1.8 TB/s per GPU (derived from DGX B200 aggregate)
Power
Platform dependent; DGX B200 is ~14.3 kW max for 8 GPUs
Software stack
CUDA, TensorRT-LLM, NVIDIA AI Enterprise
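
The per-GPU peaks and HBM bandwidth above are all that a basic roofline model needs. A minimal sketch: the ridge point (peak FLOP/s divided by peak bandwidth) is the arithmetic intensity at which a kernel shifts from bandwidth-bound to compute-bound on this part.

```python
# Roofline ridge points for the B200 SXM figures listed above.
# Ridge point = peak FLOP/s / peak HBM bandwidth; kernels whose arithmetic
# intensity (FLOPs per byte of HBM traffic) falls below it are bandwidth-bound.

HBM_BW_BYTES_PER_S = 8e12  # 8 TB/s, from the table above

PEAKS_FLOPS = {
    "BF16":      2.25e15,  # 2.25 PFLOPS
    "FP8 dense": 4.5e15,   # 4.5 PFLOPS
    "FP4 dense": 9e15,     # 9 PFLOPS
}

def ridge_point(peak_flops: float, bw: float = HBM_BW_BYTES_PER_S) -> float:
    """Arithmetic intensity (FLOPs/byte) where compute and bandwidth balance."""
    return peak_flops / bw

for name, peak in PEAKS_FLOPS.items():
    print(f"{name}: ridge point = {ridge_point(peak):.2f} FLOPs/byte")
```

At BF16 the ridge point works out to about 281 FLOPs/byte, so memory-light workloads such as decode-phase LLM inference sit well into the bandwidth-bound region.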

Notes

  • Per-GPU memory, bandwidth, FP4, and FP8 entries are derived from NVIDIA DGX B200 8-GPU aggregate specifications.
  • NVIDIA Blackwell adds FP4 tensor core support and a two-die GPU package connected by a 10 TB/s chip-to-chip link.
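
The derivation described in the first note above is a straight division by the GPU count. A sketch, with the DGX B200 aggregate figures taken as assumptions from NVIDIA's published system specifications (verify against the current datasheet before relying on them):

```python
# Hedged sketch: per-GPU figures derived from DGX B200 8-GPU aggregates,
# as the note above describes. The aggregate values below are assumptions
# drawn from NVIDIA's DGX B200 system specifications.

GPUS_PER_NODE = 8

dgx_b200_aggregate = {
    "memory_gb":          1440,  # assumed total HBM3E across 8 GPUs
    "hbm_bw_tb_per_s":    64,    # assumed aggregate HBM bandwidth
    "fp8_sparse_pflops":  72,    # assumed aggregate FP8 (sparse)
    "fp4_sparse_pflops":  144,   # assumed aggregate FP4 (sparse)
}

per_gpu = {k: v / GPUS_PER_NODE for k, v in dgx_b200_aggregate.items()}
for key, value in per_gpu.items():
    print(f"{key}: {value:g}")
```

Dividing through reproduces the table above: 180 GB of memory, 8 TB/s of bandwidth, 9 PFLOPS FP8 sparse, and 18 PFLOPS FP4 sparse per GPU.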
