Juniper & Broadcom: Why Ethernet Always Wins

Data Center | 400G & 800G

In this session from our Seize the AI Moment virtual event, Juniper COO Manoj Leelanivas and Charlie Kawwas, President of Broadcom, talk AI market trends, customer use cases, and common questions. They also share their vision of how Juniper and Broadcom are addressing the needs of the market.

You’ll learn

  • What networking hardware enterprise customers need for their AI infrastructure

  • What future innovations Broadcom and Juniper have in store

Who is this for?

  • Network Professionals

  • Business Leaders

Transcript

0:00 [Music]

0:07 Before we dive in, tell us a bit about yourselves, guys. Let's start with you, Ram.

0:13 My name is Ram Velaga. I run the switching and routing Ethernet business at Broadcom, and for the last couple of years I've really been fighting the Ethernet versus InfiniBand battle.

0:25 Excellent. Ray?

0:28 I'm Ray Mota. I'm CEO and principal analyst for ACG, and we focus primarily on the service provider space and the large enterprise, doing a lot of work on research as well as economic modeling.

0:37 Ethernet has been a hot topic lately.

0:39 It has indeed.

0:42 So let's jump right in. Ram, I'm going to start with you. At your most recent AI virtual event, you talked about InfiniBand versus Ethernet, but I think you really set the context properly by talking about the importance of the network, especially in today's AI clusters, AI workloads, and AI applications. Can you tell us a bit about that?

1:08 Yeah, sure. If you think about AI, the first thing you have to understand is that it's a distributed computing problem. What I mean by that is that you cannot take an AI workload and run it on one GPU, no matter how big your GPU is. Somebody today can come and say they have the fastest GPU; two years from now they can come and say they have something that's even faster. But the reality is that any particular GPU or accelerator is only as big as what a TSMC can build, or the advanced packaging you can do, or the fastest HBM you can put on it. What you really need for a machine learning or AI workload is many tens of thousands of these GPUs all acting together as if they were one very, very large computer. To do that, all of them have to be tied together; all of them have to be networked together. That's what we mean when we say this is a distributed computing problem. And when you have a distributed computing problem, the network is what ties all of this together, and the network becomes the computer. Anyone can build the fastest GPU, but if they cannot tie all of these GPUs together to act as one large computer, the whole thing falls apart. That's what we mean by "the network is the computer." And if the network is the computer and you want to build the best network out there, there's nothing like Ethernet.

2:32 There was a finding from Meta, which I thought was really insightful, that you highlighted in your talk. Can you tell us what that finding was about, and how the network is critical for making sure that these GPUs don't stay idle?

2:47 So, I don't know if you've seen this presentation, which Meta presented, I believe, a couple of years ago at OCP. What they showed was the different workloads they had, different kinds of recommendation models, and how much time was spent in the network, with traffic going back and forth between the GPUs. It varied anywhere from 20% to almost 57%. What that means is that somewhere between 20% and almost 60% of the time, the GPUs are sitting idle, waiting for traffic to be shuffled between the different GPUs. Now think about it: these GPUs typically sell for between $20,000 and $30,000, depending on how favored a customer you are to the vendor, and sometimes a lot more. You take those and put together 100,000 of them. Do the math: that's anywhere between two and three-plus billion dollars in GPUs. If those things are sitting idle, that's a pretty expensive affair.

3:52 You're talking about a billion dollars, right? Like 30%?

3:53 Yeah. If you're sitting idle 50% of the time, you have about a billion and a half dollars sitting idle.
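As a rough illustration of that arithmetic, here is a minimal back-of-the-envelope sketch; the GPU count, price, and idle fractions are assumptions taken from the ranges mentioned above, not figures from the Meta study:

```python
# Back-of-the-envelope estimate of GPU capital stranded by network wait time.
# All numbers below are illustrative assumptions, not measured data.

def idle_capital(num_gpus: int, price_per_gpu: float, idle_fraction: float) -> float:
    """Dollar value of GPU capital sitting idle while waiting on the network."""
    return num_gpus * price_per_gpu * idle_fraction

cluster_cost = 100_000 * 25_000  # ~100k GPUs at ~$25k each -> ~$2.5B
print(f"Cluster capex: ${cluster_cost / 1e9:.1f}B")

for idle in (0.2, 0.5):  # 20% to ~50% of time spent waiting on the network
    stranded = idle_capital(100_000, 25_000, idle)
    print(f"{idle:.0%} idle -> ${stranded / 1e9:.2f}B of GPU capital effectively stranded")
```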

4:01 In that same study, Meta actually built two different clusters, one with Ethernet and one with InfiniBand, so we can talk now about Ethernet versus InfiniBand. Look, two years ago, if you talked to anybody, they said that if you were building a GPU cluster for AI and machine learning, nothing other than InfiniBand would work. I remember it; everywhere you went it was, "If it's not InfiniBand, this is not going to work," and I was sitting there scratching my head saying, "That's not true." Today, when you look at it, seven out of the top eight largest clusters in the world are built on Ethernet. There is one last remaining cluster built on InfiniBand, but my take is that in a year or a year and a half from now, that one will also be based on Ethernet.

4:48 So what's happened over this two-year period? Initially, you get a solution that's purpose-built by the vendor, and the vendor says, "Look, you cannot touch any of this. You've got the GPU, you've got these cables, you've got the switches, and all of this is pre-engineered by us; if you touch it, the thing is not going to work." There's a lot of fear, uncertainty, and doubt, and based on that, customers who are in a rush to deploy these systems just take it as it's given to them. But then, as customers start to deploy, they find out that, number one, operationally InfiniBand is very different from Ethernet, and number two, it has a tendency to break down quite a bit more than Ethernet, because Ethernet is built under the notion that it is going to be very scalable and reliable. So customers have gone through these experiences and said, "Look, I have to actually benchmark InfiniBand versus Ethernet to see if it's worth the hassle of maintaining this InfiniBand, which is very fragile." They started to test, and Meta put out this paper: they tested both over 24,000-plus GPUs and found that Ethernet was pretty good; in many cases the performance was very comparable to InfiniBand, but with the operational ease and reliability you expect from Ethernet. More and more benchmarks have been done across the industry, and that's why the industry has moved on.

6:11 Yeah, and I think there's history here too. In the past, when I was a CTO, I loved ATM technology because of how it sliced and diced data; we had triple play back then. But I remember designing trading floors, and my boss came over and said, "I want you to try these broker workstations with Ethernet." I said, "Ethernet? Are you kidding me? We have ATM." But then our CFO came in: "Hey, it's a $1,200 NIC card for 25-Meg ATM, and it's $69 for 100-Meg Ethernet." That's when I learned the economics part of network design. Initially I was concerned about the architecture of Ethernet, but it just kept getting better, better speed and better efficiency, so I learned early on never to bet against Ethernet.

7:05 The ubiquity of Ethernet.

7:08 So, from an economic standpoint, maybe you want to comment a bit more. I know you did a study recently on the economics of Ethernet.

7:15 For those who don't know, what we do is we have a software platform that's kind of like a digital twin, but it does economic simulation modeling of any architecture versus any architecture, any technology versus any technology, or any use case or application. What we did in this particular case was model Ethernet against InfiniBand. We used a similar architecture on both sides, spine-leaf topologies, and a server environment with the DGX servers, I think the H100s. Then we had the compute network: InfiniBand on one side and, in this case, the Juniper QFX switches on the other, with interconnections ranging between 400 and 800 gig. Honestly, some of the findings: from a capex perspective the savings came out to about 55%, because even the ports on InfiniBand itself are twice as expensive as Ethernet ports.

8:14 So 50% cheaper for Ethernet compared to InfiniBand.

8:17 50% cheaper; less than half the cost. Then we looked at some of the other parts, from the switching cost to the equipment cost that a lot of people forget: cables and optics, which add up over time, and there are different requirements. The second part is the opex: how much does it cost to manage this environment? This is where we modeled intent-based automation with Apstra to see how much we could simplify. So the overall TCO came out to about 56% savings over a three-year time frame.
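The shape of that comparison (capex from ports, cables, and optics, plus opex over three years) can be sketched in a few lines. Every number below is a placeholder assumption for illustration only; it is not the ACG model or its inputs:

```python
# Minimal three-year TCO comparison sketch mirroring the capex + opex framing above.
# All cost figures are arbitrary normalized units, not data from the study.

def three_year_tco(port_cost: float, num_ports: int, cables_optics: float,
                   annual_opex: float, years: int = 3) -> float:
    """Total cost = switch-port capex + cables/optics capex + opex over the period."""
    capex = port_cost * num_ports + cables_optics
    return capex + annual_opex * years

# Assumption echoed above: an InfiniBand port costs roughly twice an Ethernet port,
# and a single-vendor fabric is assumed (for illustration) to carry higher opex.
ethernet = three_year_tco(port_cost=1.0, num_ports=4096, cables_optics=1500, annual_opex=800)
infiniband = three_year_tco(port_cost=2.0, num_ports=4096, cables_optics=2000, annual_opex=1600)

savings = 1 - ethernet / infiniband
print(f"Illustrative three-year Ethernet TCO savings: {savings:.0%}")
```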

8:48 Wow. So we're talking about the same performance, better reliability, and less than half the cost. But to get to that performance, I think another critical factor is management: debuggability, observability, and the ecosystem of tools. Knowing what's going wrong in the cluster, or fine-tuning a cluster to make sure the parameters are all set properly from a networking standpoint, is very important, and knowing what's going on on the link, on the wire, matters too. For Ethernet there are hundreds and thousands of tools out there; of course we built one, Apstra, now part of Juniper, which works across all types of vendors, but there are many, many such tools on the market. I suspect that's also a factor compared to InfiniBand. I don't know how many tools there are to debug or provide observability into InfiniBand; I suspect that was a factor in your study?

9:54 study that was a major factor I mean

9:56 there's we we focus mostly on tangible

9:58 benefits but there's a lot of intangible

10:01 benefits associated with it where when

10:03 there's only one vendor you're at the

10:05 mercy of their timeline and their

10:07 priorities not yours right so that's a

10:10 challenge right the other part is the

10:12 number of skill sets that are available

10:14 out there to be able to support

10:16 something that's uh set up this way and

10:18 stuff like that where if you look at I

10:20 talked about just the cost of the

10:22 equipment look at the ca of the skill

10:24 sets that you have to acquire you

10:26 normally when you interview an engineer

10:27 you don't ask if they have ether skill

10:30 right so I think those are the

10:32 intangibles that people aren't thinking

10:34 about how much is it to the skill sets

10:36 to maintain because that adds up to

10:38 operational costs and more importantly

10:40 I'm more concerned about business

10:42 continuity because realize we're talking

10:44 about AI but some of these models could

10:46 be used for high performance Computing

10:48 or whatever parallel processing that

10:51 requires that type of uh uh environment

10:53 and stuff like that so there's a variety

10:55 of use cases on top of AI for that yeah

10:58 Yeah. I mean, can you imagine: an organization has multiple networks, and the more commonality there is across these networks, the better, in terms of leveraging the workforce and the expertise. There are also some security aspects, right, Ram?

11:11 right Ram yeah so let's talk about

11:13 security right so for example now

11:14 specifically when you think about ai ai

11:16 coming into the you know Enterprises

11:19 yeah what do Enterprises have that's

11:21 really differentiated that is their own

11:23 customer data their own analytics you

11:25 know how their whole business runs and a

11:26 lot of that is very proprietary and some

11:28 of it is so fun that they're not

11:30 necessarily maybe going to feel very

11:31 comfortable putting it outside their own

11:33 premise now so they start to build their

11:36 you know private AI Cloud so to say when

11:39 they start building it what needs to

11:40 happen is there's a lot of data that

11:42 goes back and forth between what is

11:44 stored in their Cold Storage active

11:46 storage and into this gpus doing the

11:48 data analysis crunching out some you

11:50 know coefficients and then pushing it

11:52 out so what this means is it has to come

11:54 into the natural security policies you

11:57 have already inside your Enterprise and

11:59 how things are being stored secured who

12:01 do you give access to what so and so

12:03 forth so if you build something with

12:05 something like infinite band which

12:06 doesn't have this notion of access

12:08 controls you know security and so on and

12:11 so forth you're going to be building

12:12 completely different islands and the

12:14 whole idea of building something that's

12:15 private you know into your Enterprise

12:16 and moving data back and forth is not

12:18 going to work right which is where kind

12:20 of having everything on ethernet one

12:22 common fabric one common set of policies

12:25 some one common set of access controls

12:27 right just makes all of the so much more

12:30 reason I'm actually glad you brought

12:32 I'm actually glad you brought that up, because not enough people are talking about the security aspects of this. I always say security is only as strong as your weakest link, and the more distributed your security is, the harder it is to manage and the more opportunities there are for penetration. You don't want pockets where people don't understand what's going on in that area. So I always say: no security, no business.

12:58 Excellent. We've said that Ethernet provides performance similar to InfiniBand; maybe tell us about UEC. There's an effort there, and I think it's about improving the scalability of RDMA specifically. Do you want to tell us a bit about the purpose of UEC and how it's going to scale Ethernet even further?

13:22 While Ethernet today does everything that InfiniBand does, with better performance, much higher reliability, and less than half the cost, we're all thinking ahead about how to improve RDMA. Not Ethernet; how do you improve RDMA? That's where UEC has come up with a bunch of improvements to RDMA. They allow multipathing, so you don't assume there's just one path between point A and point B; there are multiple paths. Then you have efficient retransmits: if packet four got dropped but five, six, and seven got through, we go back and retransmit only packet four rather than retransmitting five, six, and seven as well. It's built on the assumption that your underlying fabric might actually fail, whereas InfiniBand and RDMA assumed the underlying fabric would not fail. You build resiliency knowing failures will happen. Otherwise it's like building a skyscraper in San Francisco and saying there will be no earthquake; that's a ridiculous assumption to make. Assume there are earthquakes and retrofit the buildings. That's what UEC does.
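To make the retransmit point concrete, here is a minimal sketch contrasting go-back-N style recovery with the selective retransmit behavior described above. It is a toy model of the idea, not UEC specification code or any vendor's implementation:

```python
# Toy comparison of loss-recovery strategies when packet 4 is lost out of packets 1..7.

sent = list(range(1, 8))           # packets 1..7 were transmitted
delivered = {1, 2, 3, 5, 6, 7}     # packet 4 was dropped in the fabric

def go_back_n(sent, delivered):
    """Resend everything from the first loss onward (assumes an ordered, lossless-style flow)."""
    first_loss = min(p for p in sent if p not in delivered)
    return [p for p in sent if p >= first_loss]

def selective_retransmit(sent, delivered):
    """Resend only the packets that were actually lost (the behavior described above)."""
    return [p for p in sent if p not in delivered]

print("go-back-N resends:", go_back_n(sent, delivered))                  # [4, 5, 6, 7]
print("selective resends:", selective_retransmit(sent, delivered))       # [4]
```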

14:24 Excellent. And there's a large ecosystem around it, and multiple topology options, right? You can involve the NIC, or you may just work within the confines of the switches.

14:33 Correct, exactly.

14:36 Excellent. All right, gentlemen, that was insightful and lots of fun.

14:43 [Music]
