
Accelerating Ethernet-Native AI Clusters with Intel® Gaudi® 3 AI Accelerators and Cisco Nexus 9000


Modern enterprises face significant infrastructure challenges as large language models (LLMs) require processing and moving vast volumes of data for both training and inference. With even the most advanced processors limited by the capabilities of their supporting infrastructure, the need for robust, high-bandwidth networking has become critical. For organizations aiming to run high-performance AI workloads efficiently, a scalable, low-latency network backbone is essential to maximizing accelerator utilization and minimizing costly idle resources.

Cisco Nexus 9000 Series Switches for AI/ML workloads

Cisco Nexus 9000 Series Switches deliver the high-radix, low-latency switching fabric that AI/ML workloads demand. For Intel® Gaudi® 3 AI accelerator1 deployments, Cisco has validated specific Nexus 9000 switches and configurations to ensure optimal performance.

The Nexus 9364E-SG2 (Figure 1), for example, is the premier AI networking switch from Cisco, powered by the Silicon One G200 ASIC. In a compact 2RU form factor, it delivers:

  • 64 dense ports of 800 GbE (or 128 x 400 GbE / 256 x 200 GbE / 512 x 100 GbE via breakouts)
  • 51.2 Tbps aggregate bandwidth for non-blocking leaf-spine fabrics
  • 256 MB shared on-die packet buffer, which is vital for absorbing the synchronized traffic bursts characteristic of collective operations in distributed training
  • High-radix architecture with up to 512 ports, reducing the number of switching tiers required, lowering latency, and simplifying fabric design
  • Ultra Ethernet ready: Cisco is a founding member of the Ultra Ethernet Consortium (UEC), and Nexus 9000 switches are forward-compatible with emerging UEC specifications
Figure 1. Cisco Nexus 9364E-SG2: Optimized for scalability and open connectivity, supporting Intel® Gaudi® 3 AI accelerator deployments
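The port and bandwidth figures above are internally consistent, which is easy to verify with a quick back-of-envelope check. The numbers come from the article itself; the arithmetic below is purely illustrative.

```python
# Sanity-check the Nexus 9364E-SG2 figures quoted above: every breakout
# option carves the same aggregate capacity into more, slower ports.
native_ports = 64
native_speed_gbps = 800

# Breakout options: port speed (Gbps) -> resulting port count.
breakouts = {400: 128, 200: 256, 100: 512}

aggregate_gbps = native_ports * native_speed_gbps
aggregate_tbps = aggregate_gbps / 1000
print(f"Aggregate: {aggregate_tbps} Tbps")  # 51.2 Tbps, matching the spec

for speed, count in breakouts.items():
    # Each breakout configuration preserves the full aggregate bandwidth.
    assert count * speed == aggregate_gbps
```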

The Intel Gaudi 3 AI accelerator addresses the need for scalable, open AI systems. It was designed to provide state-of-the-art data center performance for AI workloads, including generative applications such as LLMs, diffusion models, and multimodal models. The Intel Gaudi 3 accelerator demonstrates significant improvements over previous generations, delivering up to 4x the AI compute performance for Brain Floating Point 16-bit (BF16) workloads and a 1.5x increase in memory bandwidth compared to the Intel Gaudi 2 processor.

A key differentiator is its networking infrastructure: each Intel Gaudi 3 AI accelerator integrates 24 x 200 GbE Ethernet ports, supporting large-scale system expansion with standard Ethernet protocols. This approach eliminates reliance on proprietary networking technologies and provides 2x the networking bandwidth of the Intel Gaudi 2 accelerator, enabling organizations to scale clusters seamlessly from a few nodes to several thousand.
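To put the integrated-Ethernet approach in concrete terms, the per-accelerator and per-node scale-out bandwidth follows directly from the port counts above. The 24 x 200 GbE figure is from the article; the 8-accelerator node size below is an illustrative assumption, not a validated configuration.

```python
# Illustrative scale-out bandwidth math for Intel Gaudi 3 networking.
# Port count and speed are from the article; the node size is assumed.
ports_per_accel = 24
port_speed_gbps = 200
accels_per_node = 8  # assumption for illustration only

per_accel_gbps = ports_per_accel * port_speed_gbps
per_node_tbps = per_accel_gbps * accels_per_node / 1000

print(f"{per_accel_gbps} Gbps per accelerator")  # 4800 Gbps = 4.8 Tbps
print(f"{per_node_tbps} Tbps per assumed 8-accelerator node")
```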

An integrated solution with high performance, scalability, and openness

Cisco Nexus 9364E-SG2 switches and OSFP-800G-DR8 transceivers are certified to support Intel Gaudi 3 AI accelerators in scale-out configurations for LLM training, inference, and generative AI workloads.

Key technical highlights of the validated architecture include:

  • High-speed, non-blocking connectivity: 256 x 200 Gbps interfaces on Cisco Nexus 9364E-SG2 switches enable a high-speed, non-blocking network design for interconnecting Intel Gaudi 3 accelerators
  • Lossless fabric: Full support for RDMA over Converged Ethernet version 2 (RoCEv2) with Priority Flow Control (PFC) prevents packet loss caused by congestion, thereby improving the completion times of distributed jobs
  • Simplified operations: Nexus Dashboard enables configuring Intel Gaudi 3 AI accelerators for scale-out networks using the built-in AI fabric type. It also offers templates for further customization and a single operations platform for all networks accessing an AI cluster.
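In a lossless RoCEv2 fabric of this kind, switches typically mark packets with Explicit Congestion Notification (ECN) as queues build, so that senders slow down before PFC has to pause traffic outright. The sketch below illustrates that marking behavior conceptually; the threshold values are hypothetical, not NX-OS defaults, and real deployments tune them per platform and queue.

```python
import random

# Conceptual sketch of ECN marking on a switch queue carrying RoCEv2
# traffic: below a low-water mark nothing is marked, above a high-water
# mark everything is marked, and in between marking is probabilistic.
# All threshold values here are hypothetical, for illustration only.
ECN_MIN_KB = 150     # queue depth where probabilistic marking begins
ECN_MAX_KB = 3000    # queue depth where every packet is marked
MAX_MARK_PROB = 0.1  # marking probability just below ECN_MAX_KB

def ecn_mark(queue_depth_kb: float) -> bool:
    """Return True if a packet should carry the ECN Congestion Experienced mark."""
    if queue_depth_kb <= ECN_MIN_KB:
        return False
    if queue_depth_kb >= ECN_MAX_KB:
        return True
    frac = (queue_depth_kb - ECN_MIN_KB) / (ECN_MAX_KB - ECN_MIN_KB)
    return random.random() < frac * MAX_MARK_PROB
```

The point of marking early is that RoCEv2 senders react to ECN echoes by reducing their rate, keeping queues short enough that PFC pause frames (and the head-of-line blocking they can cause) remain a last resort.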

Cisco Intelligent Packet Flow to optimize AI traffic

AI workloads generate traffic patterns unlike traditional enterprise applications: massive, synchronized bursts, "elephant flows," and continuous GPU-to-GPU communication that can overwhelm conventional networking approaches. Cisco addresses these challenges with Cisco Intelligent Packet Flow, an advanced traffic management framework built into NX-OS.

Intelligent Packet Flow incorporates several load-balancing techniques designed for AI fabrics:

  • Dynamic load balancing (flowlet-based): Real-time traffic distribution based on link utilization telemetry
  • Per-packet load balancing: Packet spraying across multiple paths for maximum throughput efficiency
  • Weighted Cost Multipath (WCMP): Intelligent path weighting combined with Dynamic Load Balancing (DLB) for asymmetric topologies
  • Policy-based load balancing: Assigns specific traffic-handling strategies to mixed workloads based on ACLs, DSCP markings, or RoCEv2 headers, providing tailored efficiency for diverse needs
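The flowlet idea behind dynamic load balancing can be sketched in a few lines: a flow may be safely moved to a less-loaded path whenever the gap between its packets exceeds a timeout, because any in-flight packets on the old path will have drained and reordering is unlikely. The timeout value and the simple byte-counter load model below are illustrative assumptions, not NX-OS internals.

```python
# Conceptual sketch of flowlet-based dynamic load balancing.
FLOWLET_TIMEOUT_US = 50  # assumed inter-packet gap that ends a flowlet

class FlowletBalancer:
    def __init__(self, num_paths: int):
        self.path_load = [0] * num_paths  # bytes sent per uplink (stand-in for telemetry)
        self.last_seen = {}               # flow id -> (last timestamp_us, chosen path)

    def pick_path(self, flow: str, now_us: int, pkt_bytes: int) -> int:
        last = self.last_seen.get(flow)
        if last and now_us - last[0] < FLOWLET_TIMEOUT_US:
            path = last[1]  # same flowlet: keep the path to preserve packet order
        else:
            # New flowlet: safe to rebalance onto the least-loaded path.
            path = min(range(len(self.path_load)), key=self.path_load.__getitem__)
        self.last_seen[flow] = (now_us, path)
        self.path_load[path] += pkt_bytes
        return path
```

Per-packet spraying is the degenerate case of a zero timeout (maximum path utilization, at the cost of reordering), which is why it pairs naturally with transports that tolerate out-of-order delivery.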

These capabilities work together to minimize job completion time, the critical metric that determines how quickly your AI models train and how efficiently your inference pipelines respond.

Unified operations with Nexus Dashboard

Deploying and operating AI infrastructure at scale requires visibility and capabilities that go far beyond traditional network monitoring. Cisco Nexus Dashboard serves as the centralized management platform for AI fabrics, providing end-to-end RoCEv2 visibility and built-in templates for AI fabric provisioning.

Key Cisco Nexus Dashboard operational capabilities include:

  • Congestion analytics: Real-time congestion scoring, Priority Flow Control and Explicit Congestion Notification (PFC/ECN) statistics, and microburst detection
  • Anomaly detection: Proactive identification of performance bottlenecks with suggested remediation
  • AI job observability: End-to-end visibility into AI workloads, from the network to the GPUs
  • Sustainability insights: Energy consumption monitoring and optimization recommendations

“AI at scale demands both compute efficiency and a high-performance AI networking fabric. The Intel® Gaudi® 3 AI accelerator combined with Cisco Nexus 9000 switching delivers an optimized, open solution that lets customers build at-scale LLM inference clusters with uncompromising, cost-efficient performance.”
—Anil Nanduri, VP, AI Go-to-Market & Product Management, Intel

A scalable, compliant, future-ready infrastructure

Cisco Nexus 9000 switches paired with Intel Gaudi 3 AI accelerators provide enterprises with a secure, open, and future-ready network and compute environment. This combination of technologies enables organizations to deploy scalable, high-performance AI clusters that meet both current and emerging workload requirements.

 

For more information, or to evaluate how this reference architecture can be tailored to your organization’s needs, see the specifications for Cisco Nexus 9300 Series Switches and Intel Gaudi 3 AI accelerators.


1 Intel, the Intel logo, and Gaudi are trademarks of Intel Corporation or its subsidiaries.
