TL;DR
- Fourth-generation HBM standard ratified by JEDEC in 2025; doubles per-stack interface to 2,048 bits.
- Per-stack bandwidth target of ~2 TB/s, with 16-high stacks reaching 48-64 GB of capacity.
- Adopted by next-generation accelerators including NVIDIA's Rubin and AMD's MI400 families.
- Production ramp through 2025-2026 follows the now-familiar HBM cadence with SK hynix in the lead.
Overview
HBM4 is the next major step in High Bandwidth Memory after HBM3 and HBM3e. JEDEC ratified the standard in 2025, doubling per-stack interface width from 1,024 bits to 2,048 bits — the largest single architectural change in HBM since the original specification.
The doubled interface, combined with continued pin-speed scaling, targets ~2 TB/s per stack at launch with substantial headroom for refreshes. Capacity per stack scales with 16-high configurations and improved through-silicon-via density.
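The ~2 TB/s figure follows from simple arithmetic on interface width and per-pin data rate. The sketch below assumes a per-pin rate of 8 Gb/s, an illustrative value chosen because it matches the stated target (the text above does not give a launch pin speed); `stack_bandwidth_gb_per_s` is a hypothetical helper name.

```python
def stack_bandwidth_gb_per_s(interface_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one stack in decimal GB/s.

    Every data pin moves pin_rate_gbps gigabits per second;
    dividing by 8 converts bits to bytes.
    """
    return interface_bits * pin_rate_gbps / 8


# Assumption: 8 Gb/s per pin (illustrative, not a quoted spec value).
hbm4 = stack_bandwidth_gb_per_s(interface_bits=2048, pin_rate_gbps=8.0)
print(f"HBM4 peak: {hbm4:.0f} GB/s (~{hbm4 / 1000:.2f} TB/s)")
```

This is a peak number; sustained bandwidth in a real system is lower and depends on access pattern and controller efficiency.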
Specifications
| Metric | HBM4 (target) |
|---|---|
| Interface width | 2,048 bits per stack |
| Bandwidth per stack | ~2 TB/s |
| Capacity per stack | 48-64 GB (16-high) |
| Channels per stack | 32 |
| Voltage | Lower than HBM3e |
| Adoption | NVIDIA Rubin, AMD MI400, other next-generation frontier accelerators |
HBM4 specifications continued to firm through 2025-2026. Treat per-stack bandwidth and capacity ceilings as production targets, not guaranteed shipping numbers.
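The 48-64 GB capacity range in the table is consistent with a 16-high stack built from 24 Gb or 32 Gb DRAM dies. Those die densities are an assumption made here to reproduce the range, not figures taken from the table; `stack_capacity_gb` is a hypothetical helper.

```python
def stack_capacity_gb(dies_per_stack: int, die_density_gbit: int) -> float:
    """Capacity of one stack in GB: number of dies times gigabits per die, over 8."""
    return dies_per_stack * die_density_gbit / 8


# Assumption: 24 Gb and 32 Gb dies in a 16-high stack.
for density_gbit in (24, 32):
    capacity = stack_capacity_gb(dies_per_stack=16, die_density_gbit=density_gbit)
    print(f"16-high, {density_gbit} Gb dies: {capacity:.0f} GB")
```

Under these assumptions the two cases give 48 GB and 64 GB, the ends of the range in the table.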
Why a Wider Interface
Pin-speed scaling has limits. Each HBM generation has nudged per-pin speeds up, but the marginal cost of pushing beyond 9.2 Gb/s rises sharply in both power and yield loss. Doubling the interface width is an alternative: it delivers a one-time bandwidth doubling without further pin-speed pressure.
The trade-off is interposer complexity. A 2,048-bit-wide interface requires more silicon interposer area and tighter routing tolerances; CoWoS-L and equivalent packaging technologies were qualified through 2024-2025 specifically to enable HBM4 integration.
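To make the width-versus-speed argument concrete, the sketch below inverts the bandwidth formula and asks what per-pin rate each interface width would need to reach a ~2 TB/s stack. The target of 2,048 GB/s and the helper name `required_pin_rate_gbps` are assumptions for illustration.

```python
def required_pin_rate_gbps(target_gb_per_s: float, interface_bits: int) -> float:
    """Per-pin rate (Gb/s) needed to reach a target stack bandwidth (GB/s)."""
    return target_gb_per_s * 8 / interface_bits


TARGET_GB_PER_S = 2048.0  # assumption: ~2 TB/s per stack

for width_bits in (1024, 2048):
    rate = required_pin_rate_gbps(TARGET_GB_PER_S, width_bits)
    print(f"{width_bits}-bit interface: {rate:.1f} Gb/s per pin")
```

At 1,024 bits the target would need 16 Gb/s per pin, well above the 9.2 Gb/s mentioned above; at 2,048 bits it needs 8 Gb/s, which is why widening the interface relieves pin-speed pressure.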
What HBM4 Enables
- Per-GPU memory pools exceeding 500 GB at frontier-accelerator scale.
- Decode bandwidth that keeps pace with FP4 compute throughput.
- Inference of trillion-parameter MoE models on smaller pod sizes.
- Reduction in tensor-parallel split count for long-context workloads.
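A rough sizing sketch shows how the capacity figures relate to the points above. It assumes 48 GB or 64 GB stacks and counts weights only at 4 bits per parameter (FP4), ignoring KV cache, activations, and framework overhead; the helper names are hypothetical and no specific accelerator's stack count is implied.

```python
import math


def stacks_needed(pool_gb: float, stack_gb: float) -> int:
    """Smallest whole number of stacks whose combined capacity covers pool_gb."""
    return math.ceil(pool_gb / stack_gb)


def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Size of the model weights alone, in decimal GB."""
    return num_params * bits_per_param / 8 / 1e9


# Stacks required for a 500 GB pool at each assumed stack capacity.
for stack_gb in (48, 64):
    print(f"{stack_gb} GB stacks for 500 GB: {stacks_needed(500, stack_gb)}")

# Weights of a one-trillion-parameter model stored at 4 bits per parameter.
print(f"1T params at FP4: {weight_footprint_gb(1e12, 4):.0f} GB")
```

Under these assumptions, eight 64 GB stacks (512 GB) are the minimum that exceeds 500 GB, and the weights of a trillion-parameter model at FP4 occupy about 500 GB before any runtime state is counted.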
Pitfalls
- Production capacity ramps slowly — first-generation HBM4 supply will be allocated to the largest customers.
- Packaging cost increases: CoWoS-L and equivalent technologies are more expensive than HBM3-era integration.
- Validation cycles are long; per-batch qualification is normal.
- Software gains from HBM4 require configuration changes: defaults sized for smaller memory budgets will under-utilise the added capacity.
Outlook
HBM4 is the memory foundation for next-generation frontier accelerators through 2026-2028. Subsequent variants (HBM4e and beyond) are expected to continue the bandwidth-and-capacity scaling pattern that HBM3 and HBM3e established.
References
- JEDEC HBM4 Standard Announcement · JEDEC
- SK hynix HBM Roadmap · SK hynix