Scale-up fabrics

Interconnect Catalog

Comparison of node-local and rack-scale interconnects used to connect CPUs, GPUs, accelerators, memory expanders, and switches.

How to read this table

Bandwidth can mean unidirectional, bidirectional, per-link, per-device, or aggregate fabric bandwidth, so two figures are comparable only when they use the same convention. This table keeps the wording explicit.
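As a rough illustration of how these conventions change a quoted number, the sketch below converts a per-lane transfer rate into per-direction and bidirectional figures. The function name `pcie_bandwidth_gb_per_s` and the returned keys are hypothetical, chosen for this example; the 128b/130b line-code efficiency is the encoding used by PCIe Gen3 through Gen5, and packet and flow-control overhead is deliberately ignored.

```python
def pcie_bandwidth_gb_per_s(gt_per_s: float, lanes: int,
                            encoding: float = 128 / 130) -> dict:
    """Estimate link bandwidth in GB/s from a per-lane rate in GT/s.

    Assumptions (illustrative, not from any vendor data sheet):
    - one bit is moved per transfer per lane;
    - `encoding` is the line-code efficiency (128b/130b by default);
    - TLP headers and flow-control overhead are ignored.
    """
    raw_per_direction = gt_per_s * lanes / 8          # bits -> bytes
    effective_per_direction = raw_per_direction * encoding
    return {
        "raw_per_direction": raw_per_direction,
        "raw_bidirectional": 2 * raw_per_direction,
        "effective_per_direction": effective_per_direction,
        "effective_bidirectional": 2 * effective_per_direction,
    }


if __name__ == "__main__":
    # A Gen5 x16 link: 32 GT/s per lane, 16 lanes.
    for name, value in pcie_bandwidth_gb_per_s(32.0, 16).items():
        print(f"{name}: {value:.1f} GB/s")
```

The same link can therefore be described honestly as roughly 63, 64, 126, or 128 GB/s depending on the convention, which is why the direction and scope of each figure should be stated alongside it.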

Topology and software stack matter as much as link rate for collective communication and model-parallel training.

Metric	NVIDIA NVLink 5 / NVSwitch	AMD Infinity Fabric	PCIe Gen5	CXL 3.x	UALink
Scope	GPU-to-GPU and rack-scale accelerator fabric	GPU package, CPU socket, and accelerator baseboard fabric	Host I/O bus	Coherent host-device and memory expansion fabric	Open accelerator scale-up fabric
Topology	Direct GPU links plus switched NVLink domains	Product-specific mesh, die-to-die, and board-level links	Root complex to endpoints and switches	PCIe physical layer with switching and fabric capabilities in newer generations	Accelerator-to-accelerator scale-up network
Coherence	GPU memory fabric semantics; coherent CPU-GPU via NVLink-C2C in Grace Hopper-class systems	Coherent within AMD CPU complexes; accelerator semantics vary by platform	Non-coherent by default; CXL layers add coherent protocols where supported	CXL.cache and CXL.mem provide coherent semantics	Intended for AI accelerator memory and collective communication semantics
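Example bandwidth	Blackwell-class GPUs quote up to 1.8 TB/s of total (bidirectional) NVLink bandwidth per GPU	MI300X platform lists 896 GB/s bidirectional Infinity Fabric bandwidth per GPU	32 GT/s per lane; about 64 GB/s per direction (about 128 GB/s bidirectional) for x16, before encoding and protocol overhead	Follows the underlying PCIe generation and lane width; CXL 3.x builds on the PCIe 6.0 physical layer at 64 GT/s per lane	Version and implementation dependent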
Common usage	DGX B200, GB200/GB300 NVL systems, large model training and inference	EPYC chiplet fabrics, Instinct accelerators, multi-GPU platforms	GPUs, NICs, SSDs, CXL devices, accelerator attachment	Memory expansion, memory pooling, coherent accelerators	Emerging alternative for multi-vendor accelerator scale-up systems
Notes	Best thought of as a scale-up accelerator fabric rather than a general-purpose I/O bus.	The same brand covers several related fabrics, so always cite the specific product context.	PCIe is universal and flexible, but it offers lower bandwidth and incurs higher software overhead than dedicated scale-up fabrics.	CXL changes the memory hierarchy more than raw link bandwidth; latency and coherency model are the interesting bits.	Worth tracking because open scale-up fabrics may matter for non-NVIDIA AI systems.
Sources	NVIDIA NVLink overview	AMD MI300X platform data sheet	PCI-SIG PCIe 5.0 overview	CXL Consortium specifications	UALink Consortium
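
To relate the Example bandwidth row to collective communication, the sketch below applies the standard bandwidth-only cost model for a ring all-reduce, in which each of N devices sends and receives 2(N-1)/N times the message size. It is a back-of-the-envelope estimate, not a benchmark: the function name `ring_allreduce_seconds` is hypothetical, latency and software overhead are ignored, and the example assumes the table's bidirectional per-GPU figures split evenly into two directions and are fully usable by the ring, which real topologies may not allow.

```python
def ring_allreduce_seconds(message_bytes: float, n_gpus: int,
                           per_direction_gb_per_s: float) -> float:
    """Bandwidth-only lower bound for a ring all-reduce.

    Each device transmits 2 * (N - 1) / N * message_bytes in total
    (reduce-scatter followed by all-gather). Latency terms are omitted.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    if per_direction_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    bytes_sent = 2 * (n_gpus - 1) / n_gpus * message_bytes
    return bytes_sent / (per_direction_gb_per_s * 1e9)


if __name__ == "__main__":
    # Assumption: per-direction bandwidth is half of the bidirectional
    # per-GPU figure quoted in the table.
    bidirectional_gb_per_s = {
        "NVLink 5": 1800.0,
        "Infinity Fabric (MI300X)": 896.0,
        "PCIe Gen5 x16": 128.0,
    }
    message_bytes = 1e9  # a 1 GB gradient buffer, chosen for illustration
    for name, bidir in bidirectional_gb_per_s.items():
        seconds = ring_allreduce_seconds(message_bytes, 8, bidir / 2)
        print(f"{name}: {seconds * 1e3:.2f} ms")
```

Because the model treats every fabric as an ideal ring, any gap between these estimates and measured collective times reflects topology and software stack effects rather than link rate alone.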