TL;DR
- Jericho 3-AI is Broadcom's deep-buffer DNX-family switch ASIC, marketed specifically as an Ethernet fabric for AI clusters.
- Provides deep HBM-backed packet buffers (shared per device), an end-to-end scheduled fabric built on virtual output queues (VOQs), and load balancing that spreads traffic across all fabric links rather than pinning each flow to one path.
- Pairs with the Ramon 3 fabric element ASIC to build distributed disaggregated chassis that scale into very large AI clusters with deterministic behaviour.
- Trade-off versus shallow-buffer Tomahawk: higher per-port cost and slightly higher latency, in exchange for more graceful absorption of incast bursts and less dependence on carefully tuned congestion control.
Overview
The Jericho 3-AI is Broadcom's DNX-family answer for AI fabrics. Where the Tomahawk family pursues raw throughput and low latency with shallow on-chip buffers, the Jericho line adds external HBM for deep buffering and uses a scheduled-fabric VOQ architecture that avoids head-of-line blocking even under heavy incast.
Announced in 2023, Jericho 3-AI is the silicon behind switches positioned as 'AI-optimised Ethernet' alternatives to InfiniBand. The pitch is that the same fabric that lets a service provider run carrier-grade traffic without packet loss can also carry RoCEv2 training collectives with deterministic AllReduce behaviour.
Architecture Highlights
- Deep external HBM buffers — gigabytes per device versus megabytes for Tomahawk.
- VOQ scheduled fabric: each ingress device keeps a separate queue per egress port, and traffic enters the fabric only after the egress side grants it credit, so a congested output cannot block traffic bound for other outputs.
- Cell-based forwarding inside the fabric (paired with Ramon 3) for predictable latency.
- Load balancing that sprays traffic across all fabric links, rather than hashing each flow onto a single path.
- Designed to interoperate as a single distributed chassis — multiple Jericho 3-AI line cards plus Ramon 3 fabric cards behave as one logical switch.
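The scheduling and cell-forwarding ideas in the list above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Broadcom's implementation. The 256-byte cell size, the round-robin grant order, the round-robin spraying policy, and every name in the code (`segment`, `spray`, `grant_credits`) are hypothetical and chosen only to show the mechanism.

```python
from collections import deque

CELL_BYTES = 256  # hypothetical cell size, not a Jericho or Ramon figure


def segment(packet_len, cell_bytes=CELL_BYTES):
    """Split a packet of `packet_len` bytes into a list of cell payload sizes."""
    full, rest = divmod(packet_len, cell_bytes)
    return [cell_bytes] * full + ([rest] if rest else [])


def spray(cells, num_links):
    """Spread cells round-robin over fabric links; return bytes sent per link."""
    per_link = [0] * num_links
    for i, size in enumerate(cells):
        per_link[i % num_links] += size
    return per_link


def grant_credits(backlog, credit_bytes, rounds):
    """Toy egress scheduler for one output port.

    Each round the port can accept one credit's worth of data. The credit
    goes to the next ingress VOQ, in round-robin order, that still has
    data queued for this port. Ingress devices send only what was granted.
    """
    backlog = dict(backlog)
    granted = {src: 0 for src in backlog}
    order = deque(backlog)
    for _ in range(rounds):
        for _ in range(len(order)):
            src = order[0]
            order.rotate(-1)
            if backlog[src] > 0:
                sent = min(credit_bytes, backlog[src])
                backlog[src] -= sent
                granted[src] += sent
                break
    return granted, backlog


# Three ingress devices hold data for the same egress port.
granted, left = grant_credits(
    {"ingress0": 4096, "ingress1": 4096, "ingress2": 1024},
    credit_bytes=2048,
    rounds=4,
)
print(granted)  # {'ingress0': 4096, 'ingress1': 2048, 'ingress2': 1024}
print(left)     # {'ingress0': 0, 'ingress1': 2048, 'ingress2': 0}

# A 1500-byte packet cut into cells and sprayed over three fabric links.
print(spray(segment(1500), num_links=3))  # [512, 512, 476]
```

The point of the sketch is that the egress port never receives more than it granted, so excess data waits in the ingress queues instead of piling up inside the fabric, and that per-cell spraying keeps link loads within about one cell of each other regardless of flow sizes.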
When to Choose Jericho 3-AI
The natural fit is large multi-tenant AI fabrics where traffic patterns are less predictable than a single homogeneous training job. The deep buffers and VOQ absorb the incast bursts that would otherwise trigger PFC pauses on shallow-buffer fabrics.
Shallow-buffer Tomahawk-class fabrics remain preferred for single-tenant training pods with well-understood traffic patterns and disciplined congestion tuning: they cost less per port and have slightly lower latency.
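A back-of-envelope calculation shows why buffer depth matters for incast. The sketch below assumes N senders each deliver one burst at line rate toward a single receiver port of the same speed, with no congestion-control reaction during the burst. The buffer sizes are illustrative placeholders, not datasheet values for any Tomahawk or Jericho device, and real devices share buffer across ports.

```python
MIB = 2**20
GIB = 2**30


def incast_peak_backlog(senders, burst_bytes):
    """Peak queue depth for an N-to-1 incast at equal link speeds.

    While all senders transmit, data arrives at `senders` times the rate
    the receiving port can drain. By the time each sender has finished its
    burst, the port has drained one burst's worth, so (senders - 1)
    bursts are still queued somewhere in the network.
    """
    if senders < 1:
        raise ValueError("need at least one sender")
    return (senders - 1) * burst_bytes


def absorbs(peak_bytes, buffer_bytes):
    """True if the backlog fits without dropping or pausing."""
    return peak_bytes <= buffer_bytes


SHALLOW_BUFFER = 64 * MIB  # hypothetical on-chip buffer
DEEP_BUFFER = 8 * GIB      # hypothetical HBM-backed buffer

peak = incast_peak_backlog(senders=32, burst_bytes=4 * MIB)
print(peak // MIB)                    # 124
print(absorbs(peak, SHALLOW_BUFFER))  # False
print(absorbs(peak, DEEP_BUFFER))     # True
```

Under these assumptions a 32-to-1 incast of 4 MiB bursts leaves 124 MiB queued, which overflows the smaller buffer and would force PFC pauses or drops, while the larger one holds it. In a VOQ design that backlog sits in the ingress queues of the sending devices rather than at the congested egress port.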
Operational Notes
- VOQ fabric removes the need for aggressive PFC tuning but does not eliminate the value of ECN feedback to endpoints.
- Distributed chassis (DDC) deployments can scale into tens of thousands of 400G/800G ports as a single logical entity.
- Software ecosystem: SONiC support varies by platform; vendor NOS (Arista, Cisco, Nokia) provides richer DNX integration.
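As a reminder of what ECN feedback looks like at a queue, here is a generic sketch of the threshold-based marking profile commonly paired with RoCEv2 congestion control. It is not a Jericho 3-AI or vendor NOS configuration. The parameter names (`k_min`, `k_max`, `p_max`) and the example thresholds are assumptions for illustration.

```python
import random


def ecn_mark_probability(queue_bytes, k_min, k_max, p_max):
    """Probability of setting ECN Congestion Experienced on a packet.

    Below k_min nothing is marked, above k_max everything is marked, and
    in between the probability rises linearly from 0 to p_max.
    """
    if k_max <= k_min:
        raise ValueError("k_max must exceed k_min")
    if queue_bytes <= k_min:
        return 0.0
    if queue_bytes >= k_max:
        return 1.0
    return p_max * (queue_bytes - k_min) / (k_max - k_min)


def should_mark(queue_bytes, k_min, k_max, p_max, rng=random):
    """Draw one marking decision for a packet arriving at this queue depth."""
    return rng.random() < ecn_mark_probability(queue_bytes, k_min, k_max, p_max)


# Hypothetical thresholds: start marking at 100 KB, mark everything at 400 KB.
K_MIN, K_MAX, P_MAX = 100_000, 400_000, 0.2

for depth in (50_000, 250_000, 500_000):
    print(depth, ecn_mark_probability(depth, K_MIN, K_MAX, P_MAX))
```

Endpoints that see marked packets reduce their sending rate, which keeps queues short over the long run even when deep buffers and VOQs are available to absorb short bursts.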