TL;DR
- Quantum-2 is NVIDIA's NDR-generation InfiniBand switch ASIC, providing 64 ports of 400 Gb/s (or 128 ports of 200 Gb/s split mode) per 1U appliance.
- Implements SHARPv3 in-network reduction, adaptive routing, and credit-based lossless flow control with sub-microsecond cut-through latency.
- Sold as the MQM9700 series (64-port unmanaged/managed variants) and forms the leaf and spine tiers of every DGX SuperPOD reference architecture from 2022 onward.
- Total switching capacity is 51.2 Tb/s per ASIC; a single rack of Quantum-2 leaves can service a 1,024-GPU pod at full bisection.
Overview#
The NVIDIA Quantum-2 is the InfiniBand switch ASIC that anchored the NDR generation. Announced in 2021 and shipping in volume from 2022, it is the silicon inside the MQM9700 family of 1U switches that fill the leaf and spine tiers of essentially every DGX SuperPOD, HGX-based hyperscale pod, and neocloud GPU fabric built between 2023 and 2026.
Quantum-2 is what made fat-tree fabrics at 400 Gb/s economically practical. A single 64-port box at 400 Gb/s replaces what previously took two HDR switches; a two-tier fat tree of Quantum-2 leaves and spines scales to 2,048 endpoints at full bisection bandwidth.
Specifications#
| Property | Value |
|---|---|
| Generation | NDR (Next Data Rate) |
| Aggregate switching capacity | 51.2 Tb/s |
| Port count | 64 × 400 Gb/s |
| Split mode | 128 × 200 Gb/s |
| Form factor | 1U appliance (MQM9700) |
| Cut-through latency | ~330 ns port-to-port |
| In-network reduction | SHARPv3 |
| Routing | Adaptive routing, SHIELD self-healing |
| Cabling | OSFP twin-port |
| Managed variants | MQM9700-NS2F (managed), MQM9790-NS2F (unmanaged) |
Architecture#
Quantum-2's defining feature beyond raw bandwidth is the third-generation SHARP (SHARPv3) engine. The ASIC contains hardware reduction units that perform sum/min/max/avg operations on packet payloads as they traverse the switch — letting AllReduce collectives complete in-network rather than at endpoints. For training loads where the collective is the critical path, the speed-up is measurable and predictable.
Adaptive routing chooses among equivalent-cost paths on a per-packet basis using switch-local congestion signals. SHIELD (Self-Healing Interconnect Enhancement for Intelligent Datacentres) lets a switch reroute around a failed link within hundreds of nanoseconds, preserving collective progress through transient link faults.
Topology Patterns#
- Single-tier: up to 64 endpoints per switch with full any-to-any bisection.
- Two-tier fat tree: 32 leaves × 32 spines yields 1,024 endpoints at full bisection.
- Three-tier fat tree: scales to ~16,000 endpoints, used in 10k-GPU SuperPODs.
- Dragonfly+ supported but less common in commercial AI deployments.
Operational Notes#
- Subnet Manager required: OpenSM or NVIDIA UFM Telemetry/Cyber-AI.
- Firmware: MLNX-OS for switches, MLNX_OFED or DOCA-Host on endpoints — keep aligned across the fabric.
- Telemetry: per-port packet counters, ECN marks, and credit-stall counters exposed via UFM; integrate into Prometheus for production observability.
- Power: a fully loaded MQM9700 draws ~750 W typical; rack PDU sizing should assume 1 kW per leaf.
References
- NVIDIA Quantum-2 InfiniBand Platform · NVIDIA
- MQM9700 Series Product Brief · NVIDIA
- DGX SuperPOD Reference Architecture · NVIDIA