Juniper Networks Demonstrates Path Diversity and Low Latency Routing with Paragon Automation
![Tech Field Day Showcase title card reading "SHOWCASE Presented by: Juniper Networks, The Autonomous Transport Network Key Success Factors," flanked by headshots of Anton Elita and Cyril Doussau](https://i.ytimg.com/vi/AoYNSd-CKvw/hqdefault.jpg)
Catch a deep-dive session on the Network Optimization piece of the Paragon Automation suite. The session includes a live demonstration.
You’ll learn

- Path diversity
- Low latency routing
Who is this for?
Host
![Cyril Doussau Headshot](/content/dam/www/assets/mediaportal/speakers-hosts/cyril-doussau.jpeg/jcr:content/renditions/cq5dam.web.1280.1280.jpeg)
Guest speakers
![Peter Welcher Headshot](/content/dam/www/assets/mediaportal/speakers-hosts/2023/peter-welcher.jpg/jcr:content/renditions/cq5dam.web.1280.1280.jpeg)
![David Penaloza Seijas Headshot](/content/dam/www/assets/mediaportal/speakers-hosts/2023/david-seijas.jpg/jcr:content/renditions/cq5dam.web.1280.1280.jpeg)
![Steve Puluka Headshot](/content/dam/www/assets/mediaportal/speakers-hosts/2023/steve-puluka.jpg/jcr:content/renditions/cq5dam.web.1280.1280.jpeg)
Transcript
0:09 I'm Anton Elita, a technical solutions consultant at Juniper Networks. We have been talking about the Paragon Automation suite and its applications in the network, from planning, orchestration, and assurance up to optimization. Today we'd like to do a deeper dive into the last part, which is optimization in a live network.
0:36 Right, with the first use case, which is path diversity. Some customers have a business requirement to provide truly diverse label-switched paths, to avoid a single point of failure in the network without relying on fast-reroute techniques. Those customers typically also request bidirectional co-routed LSPs, so that the forward and reverse paths stick to the same set of links and nodes.
1:14 In such circumstances it is essentially required to have a central controller with a global network view, because if you look at this diagram, an ingress PE like PE1 has no idea of the LSPs that are instantiated from another ingress PE like PE2. Only a controller with a global view is able to provide truly diverse LSPs that start from different ingress PEs.
1:48 I will switch to the network view now. This is our base example network with a few nodes, and I will now show the link labels according to the IS-IS metrics. We have pre-created two different tunnels: one going from Amsterdam to Berlin, and another going from Brussels to Prague. They start and end on different nodes in the network, and due to the metrics in this example network they cross the same middle point in Hamburg. That is a single point of failure: if something happens there, both LSPs will need to be rerouted, or will go down for a certain period of time.
2:46 So how do we avoid this? We could of course do the provisioning from the controller: we have specific tabs here to provision diverse tunnels. But I would also like to touch upon automation and APIs. Pathfinder has a northbound interface using REST, and if we were to program those LSPs automatically, we would use this interface to push the request to Pathfinder. I will now show such a REST client.

3:32 I first need to authenticate myself with Pathfinder. I get a token as a reply, and I can use this token in my programmatic API calls to Pathfinder. The content of the REST call is formatted in JSON, and it says which LSPs to create: their names, the provisioning method, which is PCEP, the end nodes, and the properties for each LSP. Important, of course, are the diversity level, the diversity group, and that we want to create a co-routed pair. The same goes for the other pair of LSPs that we are going to signal.
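A rough sketch of what such a northbound call could look like, using only the Python standard library. The host, endpoint paths, and JSON field names here are illustrative assumptions, not the actual Pathfinder schema:

```python
import json
import urllib.request

BASE_URL = "https://pathfinder.example.net:8443"  # hypothetical controller address

def build_diverse_lsp_request(name, src, dst, diversity_group):
    """Build one LSP provisioning object: PCEP-provisioned, bidirectional
    co-routed, link/node-diverse within the given diversity group.
    Field names are illustrative, not the exact Pathfinder schema."""
    return {
        "name": name,
        "from": {"address": src},
        "to": {"address": dst},
        "provisioningType": "PCEP",
        "bidirectional": True,            # co-routed forward/reverse pair
        "diversityLevel": "link-and-node",
        "diversityGroup": diversity_group,
    }

def provision(token, lsps):
    """POST the LSP list to the (hypothetical) northbound endpoint,
    authenticating with the token obtained from the login call.
    The reply mirrors the request, with admin status etc. added."""
    req = urllib.request.Request(
        BASE_URL + "/api/v2/lsps",
        data=json.dumps(lsps).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer " + token},
        method="POST",
    )
    return urllib.request.urlopen(req)

# Two LSP pairs placed in the same diversity group, so the controller
# must compute mutually diverse paths for them:
payload = [
    build_diverse_lsp_request("LSP-AMS-BER", "amsterdam", "berlin", "demo-div"),
    build_diverse_lsp_request("LSP-BRU-PRG", "brussels", "prague", "demo-div"),
]
```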
4:26 So now I actually hit the send button, and I already receive the reply from Pathfinder in the bottom part of the screen, basically mirroring my request with a few attributes added, like admin status and some others. Now it is safe to switch back to the network view, which suggests that we have new network events. I will refresh the LSP table to show the recently added LSPs. These are selected now: Brussels to Prague, and its reverse direction from Prague to Brussels, take exactly the same links and nodes throughout the network. The same happens with the second pair of bidirectional co-routed LSPs, between Amsterdam and Berlin. But if I select all four of them, this shows that they are indeed truly diverse to each other and do not cross any single point of failure in this network.
5:39 So with this we have shown that using a controller with a global network view allows establishing and maintaining path diversity, even if you have a requirement for bidirectional co-routed LSPs.

5:54 I just want to confirm that this system is also able to take into account things like shared-risk link groups, and coloring that can label the underlying shared physical infrastructure as opposed to the logical one.
6:17 Yes, a really great comment. We indeed have the possibility to take diversity into consideration from the site level, where we have multiple nodes, all the way down to a single link, and shared-risk link groups including both nodes and links.
6:38 Let's say that your network is significantly larger than this, and you need an explicit ERO that transits the entire network and exceeds the maximum label depth of the hardware. Does this platform support things like a binding SID to create a longer ERO than, say, 12 labels or whatever the maximum segment depth or maximum label depth is?
7:15 Yes, this is indeed a question for many service providers, where the number of hops might exceed the hardware capabilities of the ingress nodes. For this we have foreseen a few solutions. One of them is label compression: Pathfinder is able to create LSPs consisting of as little as a single label, if you don't need to stick to specific nodes. But if your requirement is to go through certain segments in the network, then indeed we can leverage binding SIDs. This is supported in Pathfinder to create smaller label stacks, so that the transit node decompresses the binding SID and sends the packet over the next list of segments.
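The mechanics can be sketched with a toy model, assuming a hypothetical `allocate_bsid` helper that hands a binding SID to a transit node for later expansion. This is a simplified illustration of the idea, not Pathfinder's actual algorithm:

```python
def compress_ero(ero_sids, msd, allocate_bsid):
    """Fit an explicit path (a list of segment IDs) into an ingress
    node's maximum SID depth (MSD): keep msd-1 real SIDs, then replace
    the remainder with a single binding SID. allocate_bsid(chunk)
    returns the binding SID standing in for that remaining chunk,
    which a downstream transit node expands into the next stack."""
    if len(ero_sids) <= msd:
        return list(ero_sids)
    head = ero_sids[:msd - 1]
    tail = compress_ero(ero_sids[msd - 1:], msd, allocate_bsid)
    return head + [allocate_bsid(tail)]

# Hypothetical allocator handing out binding SIDs from 900000 upward:
bsids = {}
def alloc(chunk):
    sid = 900000 + len(bsids)
    bsids[sid] = chunk
    return sid

# A 12-hop ERO squeezed into a 5-label stack; the transit nodes
# holding the binding SIDs expand them into the remaining hops.
stack = compress_ero(list(range(1, 13)), msd=5, allocate_bsid=alloc)
```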
8:12 All right, so essentially stitching two LSPs together so that they look like one: that is the high-level version of what I asked. The other question I have is about LSP failover: do you have support for running S-BFD inside the LSPs, and does this support that signaling?
8:40 Yes, we support seamless BFD for every provisioned LSP, and this support was also proven, together with other vendors, at the latest EANTC interoperability event we had in February this year.
8:59 This relates to recent events. One of the things in the last couple of years, as we saw the rise of bandwidth and the saturation of bandwidth with the pandemic, and since you have Europe up here specifically, is that the backbone infrastructures of Europe and the United States are very different. The United States depends on a lot of caching points that are very close to the provider edges; Europe has a lot of PNIs, and depends more on bandwidth and PNIs than on caching. When you get to a more complex topology and scenario, how does this scale as far as being able to monitor and react? At the beginning of the pandemic we saw a lot of PNIs saturated, with packet loss and things like that, as traffic crossed Eastern and Central Europe into Western Europe. If I'm managing this as a tier-one transit provider, how would I use this to scale, to manage and react to those kinds of challenges on a large scale?

9:52 We have automated congestion avoidance in the network, which will be presented in a few minutes by my colleague Julian.
9:59 OK, I'm switching to the next use case we are going to show today, which is low-latency routing. Here the business requirement is to provide the lowest latency, or maybe even guaranteed lowest latency, for critical services, possibly including service-level agreements. Modern networks have a real mix of link speeds, varying probably from 10 to 400 Gbit/s with all possible variations in between. Many service providers base their metrics not on delay but on, for example, bandwidth. That is not optimal for this business requirement, because the highest-bandwidth path is not always the lowest-delay path. So how do we meet this premium requirement without rebuilding the whole network metric system?
11:10 We have a solution comprising multiple components. First, we need to measure the latency on each network segment; this is obvious. Then we need to distribute this information to the controller and let the controller find the lowest-delay path, taking the sum of the delays on every participating network segment. But on top of this, we want to understand how our customer is experiencing the network, and how to measure that user experience. This is the big question, and we have an answer to it: simulating customer traffic over the service provider's network, so that we really see the experience a normal user would have.
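Finding the lowest-delay path over summed per-link delays is a classic shortest-path computation. A minimal sketch, using the delays quoted later in the demo (the direct Amsterdam-Brussels link at 15 ms versus just over 3 ms via Frankfurt; the per-hop split is invented here):

```python
import heapq

def lowest_delay_path(links, src, dst):
    """Dijkstra over per-link measured delays (ms). `links` maps a node
    to a list of (neighbor, delay_ms) pairs; the path cost is the sum
    of delays on every participating segment."""
    queue = [(0.0, src, [src])]
    seen = set()
    while queue:
        delay, node, path = heapq.heappop(queue)
        if node == dst:
            return delay, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, d in links.get(node, []):
            if nbr not in seen:
                heapq.heappush(queue, (delay + d, nbr, path + [nbr]))
    return None

# Toy topology: the direct link measures 15 ms, the detour via
# Frankfurt sums to about 3.1 ms, so the detour wins.
links = {
    "Amsterdam": [("Brussels", 15.0), ("Frankfurt", 1.5)],
    "Frankfurt": [("Brussels", 1.6), ("Amsterdam", 1.5)],
    "Brussels":  [("Amsterdam", 15.0), ("Frankfurt", 1.6)],
}
delay, path = lowest_delay_path(links, "Amsterdam", "Brussels")
```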
12:09 I'm switching back to the network view, and I will change the link labels to show the measured delay. On multiple links we have dynamic delay measurement, like on the selected link from Amsterdam to Frankfurt; some other links might have a static value for the delay. I would like to focus today on this cross-link, because we have an impairment tool that will make its latency look much worse. But before I impair it, I would like to review a few tunnels, a few LSPs, that cross this link. One of them, whose name starts with LL for low latency, I have selected now: it goes from Amsterdam to Brussels, but it crosses a node in Frankfurt, just because the direct Amsterdam-Brussels link has a higher latency of 15 milliseconds, compared to a total latency of just a little over three milliseconds going via Frankfurt.
13:25 Before I explain how this data gets into Pathfinder, I will switch to an impairment tool and start an impairment on this link.
13:43 While the impairment tool is doing its job, let me explain how we get this data. First, the two adjacent nodes, like Amsterdam and Frankfurt, send so-called Two-Way Active Measurement Protocol (TWAMP) probes across the link to each other, to very precisely measure the latency on the link; it is measured in microseconds. This information is then propagated into the IGP, like IS-IS or OSPF, and from there, for each network domain, we export this data along with other traffic information to a central controller like Paragon Pathfinder. Then Pathfinder is able to figure out the lowest-delay path, and all that remains is to use the PCEP protocol to signal an LSP, or to change the path of an existing LSP.
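The arithmetic behind a two-way measurement is simple and worth spelling out; the timestamp values below are invented for illustration:

```python
def twamp_round_trip_us(t1, t2, t3, t4):
    """Two-way delay from the four TWAMP timestamps, in microseconds:
    t1 sender transmit, t2 reflector receive, t3 reflector transmit,
    t4 sender receive. Subtracting the reflector's processing time
    (t3 - t2) leaves only time spent on the wire, which is why the
    two endpoints' clocks need not be synchronized for the
    round-trip figure."""
    return (t4 - t1) - (t3 - t2)

# Sender sends at t=0, reflector receives at 430 us, replies at 470 us,
# sender receives at 900 us: (900 - 0) - (470 - 430) = 860 us round trip.
rtt = twamp_round_trip_us(0, 430, 470, 900)
```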
14:53 If you looked at the screen while I was talking, you probably already noticed that this cross-link's average delay has increased, from sub-one millisecond to around 35 milliseconds on average for the last measurement period.
15:15 So how do we know whether this increase in measured delay actually introduces a problem for our customers? For this we use Paragon Active Assurance to inject synthetic probes that mimic customer traffic. Here I have a set of low-latency probes that use the customer VPNs all around the network, and you probably already see that the green bar showing the quality of our service for the last 15 minutes, the selected interval, has turned from green first to red and then to black.
16:08 Let me explain what these colors mean. This is a drill-down view of the same active probe. Previously we had a delay value within the SLA agreed with this customer; after introducing the impairment, the delay jumped up to 50 milliseconds, which breaches the contract. That value is then considered equal to an outage, because it is far higher than we promised.
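The green/red/black classification is essentially two thresholds; a minimal sketch, with invented 20 ms SLA and 40 ms outage thresholds (the actual contracted values are not stated in the demo):

```python
def sla_state(delay_ms, sla_ms=20.0, outage_ms=40.0):
    """Map a measured delay to the colors of the demo's quality bar.
    Thresholds are illustrative: in the demo the impaired delay of
    ~50 ms exceeded both, i.e. it counted as an outage rather than
    a mere SLA breach."""
    if delay_ms <= sla_ms:
        return "green"   # within the contracted SLA
    if delay_ms < outage_ms:
        return "red"     # SLA breached
    return "black"       # so far beyond the promise it counts as an outage
```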
16:48 But you probably also noticed that after some time the delay went back down to a couple of milliseconds. Let us see what happened and why the delay returned to normal.
17:06 I'm switching back to our network view. We already saw that Pathfinder received the updated delay information from the network and reflected it in the user interface, but what happened in the background? For this demo, Pathfinder has an aggressive LSP optimization timer. This timer reviews the delays in the network, looks for delay-sensitive LSPs, and can reroute them to a new path automatically, without human intervention.
17:45 And this is exactly what happened to the example LSP we saw a bit earlier: instead of going Amsterdam, Frankfurt, Brussels, it now takes the direct path from Amsterdam to Brussels, just because the latency on that link is 15 milliseconds, which is now far lower than the sum of latencies on the path via Frankfurt. To be sure we are looking at the same LSP, we can check the events for this LSP over the last period of time. I will select a value in the past: this is exactly what we saw when we started the demo. I can then visually compare it with the LSP path as of now; I have selected the latest LSP update, and we clearly see that the change was exactly as we noticed earlier.
18:41 With this we have reviewed a much-requested use case: low-delay service placement with continuous measurement of the customer experience, as well as automated LSP optimization in a changing network environment. It gives an operator a very powerful tool to provide best-in-class service for their customers.

19:07 How do you account for failures in the MPLS data plane? Something that is really common on any kind of equipment is that a route gets pushed into the MPLS forwarding database, and you've got an LSP, but the ASIC and the table are out of sync, so you don't actually forward. You're looking at this, you obviously have an LSP, you think your LSP is good, you think you're going to move traffic to it, but it doesn't actually work because of a bug that leaves the ASICs out of sync with the forwarding table. How would you handle a condition like that with the controller? Is that something that is part of the monitoring?
19:43 Yes, this is a very tough use case, to find the culprit. We can address it with two approaches, or basically a combination of the two. One would be, as shown previously, the Active Assurance probes, which can detect in a timely manner that traffic is being black-holed, and can trigger additional automated checks in Paragon Insights. Paragon Insights might then start preparing a set of tests for the network operator who will troubleshoot afterwards, for example traceroutes and collection of interface information, right at the moment the check was triggered. That covers active monitoring, but with Insights we also have passive monitoring, which runs continuously and reacts to, for example, increased counters of traffic drops on the forwarding plane. The equipment is usually able to account for packets dropped for no apparent reason, for example no route to the destination. If we have this mismatch between the programming of the data plane and the state of the routing tables, then we would be pushing increasing amounts of data towards a black-holed destination, and we would see a steep rise in such counters in our monitoring tools. There are really many counters we could monitor, and this is how we can tackle this situation.
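The "steep rise in drop counters" check can be sketched as a simple rate test over periodic counter readings. This is a simplified stand-in for what a passive-monitoring rule does, with invented sample values:

```python
def blackhole_suspects(samples, rate_threshold):
    """Given periodic readings of a forwarding-plane discard counter
    as (seconds, cumulative_drops) pairs, flag the intervals whose
    drop rate (drops/second) exceeds the threshold -- the steep rise
    you would expect when traffic is pushed towards a black-holed
    destination."""
    alerts = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rate = (c1 - c0) / (t1 - t0)
        if rate > rate_threshold:
            alerts.append((t1, rate))
    return alerts

# Counter is nearly flat, then jumps by 90,000 drops within 60 seconds:
readings = [(0, 100), (60, 130), (120, 90130)]
alerts = blackhole_suspects(readings, rate_threshold=100.0)
# flags the second interval, ending at t=120 s, at 1500 drops/s
```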
21:28 If for any reason there is a change triggered by the controller, and, given the telemetry from Insights and all the mapping of that information, the controller realizes or notices that the change has a detrimental effect, is the controller going to roll it back? Do you need a user confirmation or administrator confirmation for that, for instance?
21:59 This is truly configurable, of course. We understand that closed-loop automation is the way to the future, but we cannot adopt it all on day one and set out to boil the ocean with everything fully automated; the trust will be gained step by step. Today most operators would probably trust Pathfinder to do the rerouting as shown in this demo. For a fully automated set of actions, like changing configurations, perhaps rolling back, and applying artificial intelligence, we need time to gain that trust, but we are on a good path. For this we already have some artificial-intelligence bits included in Paragon Insights, which will help us get there.

23:00 We hope that our showcase shed some light on what we can achieve today with a proven cloud-native automation stack. If you want to continue investigating these technologies, you have two options to suggest: one, we have an ROI analysis that goes through the benefits of implementing such technology and quantifies those benefits; or you can simply ask for a pilot. Everything that was shown today is proven technology, technology that has been deployed by service providers around the world. We would also like to thank our delegates for all their most relevant questions, and we hope to hear from you soon. Thank you.