AI Data Center Network with Juniper Apstra, NVIDIA GPUs, and WEKA Storage—Juniper Validated Design (JVD)

Validation Framework

date_range 23-Dec-24
JVD-AICLUSTERDC-AIML-02-08

Platforms / Devices Under Test (DUT)

Table 25: Platforms / Devices Under Test (DUT)

Columns: Component; Frontend; Storage Backend; GPU Backend (Cluster 1 and 2)

Architecture
  Frontend: 3-stage Clos
  Storage Backend: 3-stage Clos
  GPU Backend: 3-stage Clos, rail optimized

Spine nodes
  Frontend: QFX5130-32CD x 2
  Storage Backend: QFX5220-32CD x 2
  GPU Backend: QFX5230-64CD x 2 (cluster 1); PTX-10008 JNP10K-LC1201 (cluster 1); QFX5240-64OD x 2 (cluster 2)

Leaf nodes
  Frontend: QFX5130-32CD x 1 (frontend-gpu-leaf); QFX5130-32CD x 1 (frontend-weka-leaf)
  Storage Backend: QFX5220-32CD x 2 (storage-backend-gpu-leaf); QFX5220-32CD x 2 (storage-backend-weka-leaf)
  GPU Backend: QFX5220-64CD x 8 (cluster 1 - stripe 1); QFX5230-64CD x 8 (cluster 1 - stripe 2); QFX5240-64CD x 8 (cluster 2 - stripes 1-2)

Leaf node <=> spine node links
  Frontend: 2 x 400GE (per frontend-leaf <=> frontend-spine link)
  Storage Backend: 2 x 400GE (per storage-backend-weka-leaf <=> storage-backend-spine link); 3 x 400GE (per storage-backend-gpu-leaf <=> storage-backend-spine link)
  GPU Backend: 2 x 400GE (per gpu-backend-spine <=> gpu-backend-leaf link)

Number of NVIDIA DGX H100 GPU servers
  2 (Cluster 2 - stripe 1)
  2 (Cluster 2 - stripe 2)

Number of NVIDIA HGX A100 GPU servers
  4 (Cluster 1 - stripe 1)
  4 (Cluster 1 - stripe 2)

NVIDIA DGX H100 GPU server <=> GPU leaf node links
  Frontend: 1 x 100GE (per GPU server <=> frontend-gpu-leaf link)
  Storage Backend: 1 x 200GE (per GPU server <=> storage-backend-gpu-leaf link)
  GPU Backend: 1 x 400GE (Cluster 2) (per GPU server <=> gpu-backend-leaf link)

NVIDIA HGX A100 GPU server <=> GPU leaf node links
  Frontend: 1 x 100GE (per GPU server <=> frontend-gpu-leaf link)
  Storage Backend: 1 x 100GE (per GPU server <=> storage-backend-gpu-leaf link)
  GPU Backend: 1 x 200GE (Cluster 1) (per GPU server <=> gpu-backend-leaf link)

Total number of GPUs
  96: 32 per stripe in cluster 1; 16 per stripe in cluster 2

WEKA storage servers
  8

WEKA storage server <=> WEKA storage leaf node links
  Frontend: 1 x 100GE (per WEKA server <=> frontend-weka-leaf link)
  Storage Backend: 1 x 200GE (per WEKA server <=> storage-backend-weka-leaf link)
  GPU Backend: N/A
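
The GPU total in the table can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the validated design. It assumes 8 GPUs per server (a value implied by the per-stripe totals in the table, not stated in it) and that the four HGX A100 servers listed twice for Cluster 1 belong to stripe 1 and stripe 2 respectively. The names `GPUS_PER_SERVER`, `servers_per_stripe`, `gpus_per_stripe`, and `total_gpus` are hypothetical and used only for illustration.

```python
# Assumption: 8 GPUs per server, implied by the table's per-stripe totals
# (4 servers -> 32 GPUs, 2 servers -> 16 GPUs).
GPUS_PER_SERVER = 8

# (cluster, stripe) -> number of GPU servers, taken from Table 25.
# Assumption: the second HGX A100 entry refers to Cluster 1, stripe 2.
servers_per_stripe = {
    ("cluster 1", "stripe 1"): 4,  # NVIDIA HGX A100
    ("cluster 1", "stripe 2"): 4,  # NVIDIA HGX A100
    ("cluster 2", "stripe 1"): 2,  # NVIDIA DGX H100
    ("cluster 2", "stripe 2"): 2,  # NVIDIA DGX H100
}


def gpus_per_stripe(servers, gpus_per_server=GPUS_PER_SERVER):
    """Return a dict mapping (cluster, stripe) to its GPU count."""
    return {key: count * gpus_per_server for key, count in servers.items()}


def total_gpus(servers, gpus_per_server=GPUS_PER_SERVER):
    """Return the GPU count summed over every stripe in every cluster."""
    return sum(gpus_per_stripe(servers, gpus_per_server).values())


per_stripe = gpus_per_stripe(servers_per_stripe)
print(per_stripe[("cluster 1", "stripe 1")])  # 32
print(per_stripe[("cluster 2", "stripe 1")])  # 16
print(total_gpus(servers_per_stripe))         # 96
```

Under these assumptions, the result matches the table: 2 stripes x 32 GPUs in cluster 1 plus 2 stripes x 16 GPUs in cluster 2 gives 96 GPUs.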
