Optimizability
The previous chapters have looked at various use cases for introducing segment routing. This chapter looks specifically at the traffic engineering (TE) aspects of SR. In particular it describes:
The TE process model
The role of explicit path definition
Routing constraints and several salient properties of SR SID types
Distributed and centralized path computation
And finally, an end-to-end SR-TE solution for the sample network leveraging a centralized controller.
The TE Process Model
In the TE process model provided in Figure 1, the operator, or a suitable automaton, acts as a “controller” in an adaptive feedback control system. This system includes:
A set of interconnected network elements (IP/MPLS network)
A network state/topology monitoring element
A network performance monitoring element
A network configuration management element
The operator/automaton formulates a control policy, observes the state of the network through the monitoring system, characterizes the traffic, and applies control actions to drive the network to an optimal state. This can be done reactively, based on the current state of the network, or proactively, based on a predicted future state.
The adaptive feedback loop may be implemented in each node in the network (distributed TE) or in a centralized manner, for example using a Path Computation Element (PCE). Furthermore, various hybrid models are possible.
Explicit Routing: A Tool for TE
Explicit routing may be desired to optimize network resources or to provide very strict service guarantees, but it is not by itself a TE solution. Rather, the output of the TE process model, the important part of a traffic engineering solution, typically requires the programming of an explicit route to realize the computed result. It is therefore desirable to be able to define a strict path across a network for one or more LSPs. Both RSVP-TE and SR provide a means to explicitly define such paths (strict hops, loose hops, and/or abstract/anycast hops) across a network.
Explicit Routing with RSVP
Network operators request, typically through configuration, LSPs that meet specific constraints. For example, a network operator could request an LSP that originates at Node R1, terminates at Node R6, reserves 100 megabits per second, and traverses blue interfaces only. A path computation module, located on a central controller (such as the PCE) or on the ingress router, computes a path that satisfies all of the constraints. Figure 3 illustrates the resulting RSVP Path and Resv messaging used to set up the LSP, along with the resulting MPLS forwarding table label operations.
The following example illustrates the required configuration. It's worth noting that no routing constraints are defined for the LSP; the only configuration shown is the definition of the explicit path. In many (or most) RSVP-TE deployments the paths are not explicitly defined. Instead, a routing constraint such as bandwidth or link affinity is defined, and the ingress node or an external CSPF computes the explicit path dynamically to meet that constraint.
R1 configuration
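A minimal sketch of what that configuration might look like, assuming illustrative LSP and path names and using the sample topology's link addressing for the lower R1-R3-R4-R5-R6 route:

    protocols {
        mpls {
            label-switched-path r1-to-r6 {
                to 1.1.1.6;            # R6 loopback
                primary via-lower;     # bind the explicit path below
            }
            path via-lower {
                # each strict hop must be directly connected to the previous hop
                10.1.13.2 strict;      # R3
                10.1.34.2 strict;      # R4
                10.1.45.2 strict;      # R5
                10.1.56.2 strict;      # R6
            }
        }
    }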
Explicit Path Definition with SR
Much like RSVP-TE explicit LSPs, a network operator requests, typically through configuration, LSPs that meet specific constraints. The main difference is not the specific configuration syntax, which is quite similar, but how the 'routing constraint' is described using the same CLI constructs: to provide complex TE constraints, an ingress SR node may need to rely on an external controller or PCE.
Using the same example as in the RSVP-TE case, a network operator could request an LSP that originates at Node R1, terminates at Node R6, reserves 100 megabits per second, and traverses blue interfaces only, but this would require an external controller/PCE to compute the path. An external PCE is required only because of the specified bandwidth constraint, as SR does not signal its reservations. If bandwidth were not required, a distributed path computation performed by R1 would suffice. Figure 4 illustrates the resulting SR path and MPLS forwarding table label operations. Note that there is a fairly significant difference in the label forwarding operations for an SR-TE LSP compared to the SPF SR LSP and the previous RSVP-TE LSPs.
SR adjacency SIDs are dynamically allocated by default. Because the next configuration example uses traditional CLI techniques to describe the explicit path, where each hop is specified by its label/SID, it is recommended that labels be pre-planned and statically assigned so that each link has a unique label/SID, much like how IP addressing is handled. It's worth noting that an SR-TE path can also be described, via the CLI, using traditional IP addresses as the specified hops, allowing the ingress router to resolve the SID and label stack.
R1: defining static adjacency SIDs
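A plausible sketch, assuming an operator-chosen per-link SID scheme that mirrors the link addressing (the label values and interface name are illustrative, not from the original example):

    protocols {
        mpls {
            # carve out a static label range for manually assigned SIDs
            label-range static-label-range 1000000 1048575;
        }
        isis {
            interface ge-0/0/1.0 {
                level 2 ipv4-adjacency-segment {
                    unprotected {
                        label 1000013;   # R1-R3 link, mirroring 10.1.13.x
                    }
                }
            }
        }
    }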
R1: defining explicit SR tunnels
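A sketch of the tunnel definition, reusing the illustrative SID values above, with one statically assigned adjacency SID per link along the R1-R3-R4-R5-R6 route:

    protocols {
        source-packet-routing {
            segment-list r1-r6-via-lower {
                # hops are listed ingress to egress
                hop-r3 label 1000013;
                hop-r4 label 1000034;
                hop-r5 label 1000045;
                hop-r6 label 1000056;
            }
            source-routing-path sr-te-lsp-to-r6 {
                to 1.1.1.6;
                primary {
                    r1-r6-via-lower;
                }
            }
        }
    }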
As mentioned above, you may use the auto-translate option and describe the path as a series of IP hops, like an RSVP-TE path, and Junos will translate them to SIDs. This has the added advantage of removing the need to statically configure adjacency SIDs, and it ensures that a controller that may have computed the path is not required to change it in the event of a link going down.
Loose Hops, Prefix-SIDs, and Anycast-SIDs
Segment Routing SIDs
A segment identifier (SID) identifies each segment. Network operators allocate SIDs using procedures similar to those used to allocate private IP (i.e., RFC 1918) addresses. With SR-MPLS, every SID maps to an MPLS label. SR-capable routers advertise the SIDs via the IGP, which floods this data, along with the TE link attributes described above, throughout the IGP domain. Therefore, each node within the IGP domain maintains an identical copy of the link-state database (LSDB) and traffic engineering database (TED). The following segment types are the most common and will be used in this chapter to describe several relevant use cases.
Adjacency: Adjacency segments represent an IGP adjacency between two routers. Junos allocates adjacency SIDs dynamically, but they can also be statically configured, which, as previously mentioned, may be useful in some scenarios.
Prefix: Prefix segments represent the IGP least-cost path between any router and a specified prefix. Prefix segments may contain one or more router hops. A node SID is a type of prefix SID.
Anycast: Anycast segments are like prefix segments in that they represent the IGP least-cost path between any router and a specified prefix. However, the specified prefix can be advertised from multiple points in the network. Note that in the example, the CLI shows only a single node announcing the anycast-SID. In an actual deployment all nodes that are part of the anycast group would advertise the same anycast-SID.
Binding: Binding SIDs represent tunnels in the SR domain. The tunnel can be another SR path, an LDP-signaled LSP, an RSVP-TE signaled LSP, or any other encapsulation.
The Critical Role of Maximum SID Depth
Maximum SID depth (MSD) is a generic concept defining the number of SIDs that a given node's LSR hardware and software are capable of imposing. It is defined in various IETF OSPF/IS-IS/BGP-LS/PCEP drafts. When SR paths are computed, it is critical that the computing entity learn the MSD that can be imposed at each node or link along a given SR path. This ensures that the SID stack depth of a computed path does not exceed the number of SIDs the node is capable of imposing.
Setting, Reporting, and Advertising MSD
When using PCEP to communicate with a PCE for LSP state reporting, control, and provisioning, the MSD is reported. The following CLI can be used to increase the reported MSD value to 5:
While this reports the MSD to a controller via a control plane protocol, Junos also requires that ingress interfaces that will impose labels have their label imposition limit increased from the default value of 3:
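For example (assuming ge-0/0/1 is one of the imposing interfaces):

    interfaces {
        ge-0/0/1 {
            unit 0 {
                family mpls {
                    maximum-labels 5;   # raise label imposition from the default of 3
                }
            }
        }
    }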
SID Depth Reduction
Because MSD is such a critical concern in many SR scenarios, the idea that the end-to-end path need not be completely specified has become very popular. In the previous examples, the CLI specified the individual hops of the SR-TE LSP (and the RSVP-TE LSP); such hops are referred to as strict. A strict hop means that the specified LSR must be directly connected to the previous hop. A loose hop, on the other hand, means that the path must pass through the specified LSR, but the LSR does not have to be directly connected to the previous hop; any valid route between the two can be used. Let's look at an example using the topology illustrated in Figure 5.
Let's say you want an LSP to go from R1 to R6 via the lower route of R3, R4, and R5. To achieve this you merely need to specify R3 as the first hop in the path and then let normal routing take over from there, since the path from R3 to R6 via R5 has an IGP metric of 30 while the path back through R1 has an IGP metric of 40. The resulting label stack for the SR-TE LSP is only two labels deep (node-SID for R3 and node-SID for R6) instead of four as previously seen:
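A sketch of such a two-label segment list, assuming node-SID label values of 103 for R3 and 100 for R6 (100 is consistent with the CSPF output shown later; 103 is illustrative):

    protocols {
        source-packet-routing {
            segment-list via-r3-loose {
                hop-r3 label 103;   # node SID for R3; traffic is IGP-routed to R3
                hop-r6 label 100;   # node SID for R6; IGP routing takes over at R3
            }
            source-routing-path sr-te-lsp-to-r6 {
                to 1.1.1.6;
                primary {
                    via-r3-loose;
                }
            }
        }
    }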
You can equate this, and the previous examples, to an MPLS version of a static route. Like static routes, manually configured paths are useful when you want explicit control, but also like static routes, they pose an administrative burden when you need to establish many paths and/or want the network to react dynamically to events such as a link coming up or going down.
Special care must be taken when explicitly configuring loose hops for a path, whether RSVP or SR, because link failures can result in unexpected forwarding behavior. For example, let's say that the link between R1 and R3 fails, as illustrated in Figure 6. Since the SR-TE LSP is a static instruction set and the node SID specified in the path is still reachable on the network, the resulting LSP path will be suboptimal, yet valid.
Binding SIDs represent another option for reducing the label stack depth when configuring explicit paths, as shown in Figure 7. Using the same simple example topology, an SR-TE LSP could be created from R3 to R5 and advertised with a binding SID, such that the ingress LSP from R1 to R6 references the binding SID in its path and the resulting label stack is only two labels deep.
R3 to R5 SR-TE LSP with binding-SID = 2000
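A sketch of the R3 configuration, assuming illustrative adjacency-SID labels for the R3-R4 and R4-R5 links; the binding SID value must come from the static label range:

    protocols {
        source-packet-routing {
            segment-list r3-r5-lower {
                hop-r4 label 1000034;   # R3-R4 adjacency SID (illustrative)
                hop-r5 label 1000045;   # R4-R5 adjacency SID (illustrative)
            }
            source-routing-path bsid-r3-to-r5 {
                to 1.1.1.5;
                binding-sid 2000;       # advertise this LSP as binding SID 2000
                primary {
                    r3-r5-lower;
                }
            }
        }
    }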
And then the SR-TE LSP from R1 to R6 would look like:
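Something like the following sketch, where the first label carries traffic to R3 and the binding SID then expands to the R3-R5 segment list (one plausible encoding; label values again illustrative):

    protocols {
        source-packet-routing {
            segment-list via-bsid {
                hop-r3 label 103;      # node SID for R3
                hop-bsid label 2000;   # binding SID advertised by R3; beyond R5
                                       # the packet follows the IGP path to R6
            }
            source-routing-path sr-te-lsp-to-r6 {
                to 1.1.1.6;
                primary {
                    via-bsid;
                }
            }
        }
    }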
Anycast SIDs represent another SID type that can be used to create some interesting forwarding behaviors in SR networks. Since the multiple nodes comprising the anycast group announce the same prefix SID, an ingress or transit node forwards toward the closest node (from an IGP metric perspective) announcing that prefix SID. This enables an operator to introduce load balancing and high availability scenarios that are somewhat unique to SR networks. Again, using our simple network topology shown in Figure 8, let's assume that R4 and R5 announce the same anycast SID, 405. You can now create an ingress LSP from R1 to R6 that results in equal-cost load balancing between the R1-R2-R5-R6 path and the R1-R3-R4-R5-R6 path, as shown.
Again, while not the primary goal of using anycast SIDs, the label stack has been reduced by specifying the anycast SID as a hop in our SR-TE LSP's path:
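A sketch, using the anycast SID 405 from Figure 8 and an assumed node-SID label of 100 for R6:

    protocols {
        source-packet-routing {
            segment-list via-anycast {
                hop-anycast label 405;   # anycast SID shared by R4 and R5; traffic
                                         # load-balances toward the nearest member
                hop-r6 label 100;        # node SID for R6
            }
            source-routing-path sr-te-lsp-to-r6 {
                to 1.1.1.6;
                primary {
                    via-anycast;
                }
            }
        }
    }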
Another way for the ingress LSR to determine the path is to dynamically calculate it. Just as OSPF and IS-IS use a Shortest Path First (SPF) algorithm to calculate a route, the ingress LSR can use a modification of the SPF algorithm called Constrained Shortest Path First (CSPF) to calculate a path. Let’s explore dynamic path calculation next.
Dynamic Path Calculation and Routing Constraints
Thus far we have looked at various ways of defining a non-shortest, or explicit, path using the various SID types that SR offers, along with a few special considerations. But defining and managing tens, hundreds, thousands, or tens of thousands of static SR paths does not scale. We must therefore look back to the TE process model and explore how to define 'simple' routing constraints that describe how SR-TE LSP paths should be computed. In other words, the explicit paths are not TE themselves but rather a means to enable TE. First, let's look at information distribution, or state/topology monitoring.
Link and Node Attributes and Routing Constraints
The most important requirement for TE is the dissemination of link and node characteristics. Just as SIDs are reliably flooded throughout a routing domain, link and node characteristics, along with TE-oriented resource availability, are flooded throughout a TE domain using extensions to the IGPs. These extensions enable the link-state routing protocols to efficiently propagate resource availability information in their routing updates: the protocols flood updates not only upon link-state or metric changes, but also upon changes in bandwidth availability from a TE perspective. The routers in the network flood these resource attributes to make them available to head-end routers for use in TE tunnel LSP path computation (dynamic tunnels).
Link-state announcements carry information that describes a given router's neighbors, attached networks, network resources, and other relevant data pertaining to actual resource availability that might later be required to perform a constraint-based SPF calculation. OSPF and IS-IS have been extended to propagate this resource availability information and to support dynamic LSP path selection in an MPLS TE environment. These link and node attributes take the form of link colors, shared risk link group (SRLG) associations, available bandwidth, and metric types (TE or IGP), to name a few, and they are available in the TED for use by the computing entity. Again, using our simple example topology shown in Figure 9, the following has been added to R5:
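A sketch of the R5 additions, assuming illustrative interface names for the R5-R2 and R5-R4 links (red's admin-group bit 23 matches the 0x800000 color seen in the TED output below; blue's bit is assumed):

    protocols {
        mpls {
            admin-groups {
                blue 22;
                red 23;
            }
            interface ge-0/0/2.0 {      # R5-R2 link (illustrative name)
                admin-group blue;
                srlg common-254;
            }
            interface ge-0/0/4.0 {      # R5-R4 link (illustrative name)
                admin-group red;
                srlg common-254;
            }
        }
        isis {
            interface ge-0/0/4.0 {
                level 2 te-metric 100;  # TE metric for the R5-R4 link
            }
        }
    }
    routing-options {
        srlg {
            common-254 {
                srlg-value 254;
            }
        }
    }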
This results in a TE topology that looks like Figure 9, where the R5-to-R2 link has been colored blue and added to the SRLG named common-254, and the link from R5 to R4 has been colored red, given a te-metric of 100, and also added to the same SRLG.
Let's observe the contents of the TED for R5's link to R4 and see how the link TE information has been flooded to R1, using IS-IS TE extensions, so that it can be used for path computation, specifically for constraint inclusion or exclusion:
user@R1> show ted database R5.00 extensive
  To: R4.00(1.1.1.4), Local: 10.1.45.2, Remote: 10.1.45.1
    Local interface index: 335, Remote interface index: 334
    Color: 0x800000 red
    Metric: 100
    IGP metric: 10
    Static BW: 1000Mbps
    Reservable BW: 1000Mbps
    Available BW [priority] bps:
        [0] 1000Mbps    [1] 1000Mbps    [2] 1000Mbps    [3] 1000Mbps
        [4] 1000Mbps    [5] 1000Mbps    [6] 1000Mbps    [7] 1000Mbps
    Interface Switching Capability Descriptor(1):
      Switching type: Packet
      Encoding type: Packet
      Maximum LSP BW [priority] bps:
        [0] 1000Mbps    [1] 1000Mbps    [2] 1000Mbps    [3] 1000Mbps
        [4] 1000Mbps    [5] 1000Mbps    [6] 1000Mbps    [7] 1000Mbps
    SRLGs: common-254
    P2P Adjacency-SID: IPV4, SID: 299840, Flags: 0x30, Weight: 0
<output truncated>
Distributed SR Constraint-based SPF
In the normal SPF calculation process, a router places itself at the head of the tree and calculates the shortest path to each destination, taking only the least-metric (cost) route into account. A key point is that this calculation gives no consideration to the bandwidth of the links along the paths. If the attributes required for a given path include parameters beyond simply the IGP cost or metric, such as link color, the topology can be constrained to eliminate the links that do not satisfy those requirements, so that the SPF algorithm returns a path meeting both the link cost and the link inclusion/exclusion requirements.
With CSPF, you use more than the link cost to identify the probable paths that can be used for TE LSP paths. The decision of which path is chosen to set up a TE LSP path is made at the computing entity, after ruling out all links that do not meet certain criteria, such as link colors, in addition to the te-cost of the link. The result of the CSPF calculation is an ordered set of SIDs that map to the next-hop addresses of the routers that form the TE LSP. Therefore, multiple TE LSPs can be instantiated by using CSPF to identify probable links in the network that meet the criteria.
Constraint-based SPF can use either administrative weights or TE metrics during the constraint-based computation. In the event of a tie, the path with the highest minimum bandwidth takes precedence, followed by the path with the least number of hops. If all else is equal, CSPF picks one of the remaining paths at random as the preferred TE LSP path.
As previously mentioned, SR paths have a few salient attributes, mainly MSD and ECMP behavior, that call for slightly different CSPF results than those of traditional RSVP-TE paths. As a result, the Junos distributed CSPF (we'll talk about external, centralized CSPFs in a moment) has been enhanced not only to provide an ordered set of adjacency SIDs for a path, but also to minimize the label stack, or at least fit within the ingress router's MSD, and to leverage any available ECMP along the resulting set of candidate paths by offering node SIDs within the segment list.
The SR-TE candidate paths are locally computed such that they satisfy the configured routing constraints. When label stack compression is disabled, the multipath CSPF computation results in an ordered set of adjacency SIDs. When label stack compression is enabled, the result is a set of compressed label stacks (composed of adjacency SIDs and node SIDs) that provide IP-like ECMP forwarding behavior wherever possible.
For all computation results, an event-driven approach provides updated results that are consistent with the current state of the network in a timely manner. However, care must be taken that the computations do not become overwhelmed during periods with large numbers of network events. The algorithm therefore has the following properties:
Very fast reaction for a single event (e.g., link failure)
Fast-paced reaction to multiple IGP events that are temporally close, as long as the computation load and the ability to consume results remain acceptable
Delayed reaction when the computation load or the ability to consume results becomes problematic
Furthermore, reaction to certain network events varies depending on whether label stack compression is enabled or not. For the following events, there is no immediate recomputation of SID-lists of candidate paths when compression is off:
Change in TE-metric of links
Link Down events where link is not traversed by candidate path
Link Up event
When label stack compression is on, the above events are acted upon to determine whether the computation results are impacted.
R1: CSPF
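A sketch of the distributed CSPF configuration, reusing the admin-group names from the R5 example; the exact hierarchy for attaching a compute profile to a primary path may vary by Junos release:

    protocols {
        source-packet-routing {
            compute-profile red-lsp {
                admin-group include-all red;    # only links colored red
            }
            compute-profile blue-lsp {
                admin-group include-all blue;   # only links colored blue
            }
            source-routing-path sr-te-lsp-to-r6 {
                to 1.1.1.6;
                primary {
                    1st-seg-to-r6 {
                        compute red-lsp;
                    }
                    2nd-seg-to-r6 {
                        compute blue-lsp;
                    }
                }
            }
        }
    }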
This configuration results in the SR-TE LSPs being created as shown in Figure 11.
Verifying the SR-TE LSPs computed by the Distributed SR CSPF
user@R1# run show spring-traffic-engineering lsp detail
Name: sr-te-lsp-to-r6
  Tunnel-source: Static configuration
  To: 1.1.1.6
  State: Up
    Path: 1st-seg-to-r6
      Outgoing interface: NA
      Auto-translate status: Disabled
      Auto-translate result: N/A
      Compute Status:Enabled , Compute Result:success , Compute-Profile Name:red-lsp
      Total number of computed paths: 1
      Computed-path-index: 1
        BFD status: N/A
        BFD name: N/A
        computed segments count: 2
          computed segment : 1 (computed-node-segment):
            node segment label: 104
            router-id: 1.1.1.4
          computed segment : 2 (computed-node-segment):
            node segment label: 100
            router-id: 1.1.1.6
    Path: 2nd-seg-to-r6
      Outgoing interface: NA
      Auto-translate status: Disabled
      Auto-translate result: N/A
      Compute Status:Enabled , Compute Result:success , Compute-Profile Name:blue-lsp
      Total number of computed paths: 1
      Computed-path-index: 1
        BFD status: N/A
        BFD name: N/A
        computed segments count: 1
          computed segment : 1 (computed-node-segment):
            node segment label: 100
            router-id: 1.1.1.6
As you can see in the output, Junos computes a path consisting of only the node SID for R4 to meet the constraints for the red path, and it determines that the blue path is simply the shortest path and thus uses only the node SID for R6, instead of computing a fully qualified path of adjacency SIDs. In contrast, if you add the no-label-stack-compression keyword you can see how a fully qualified SID list, composed of adjacency SIDs, is computed.
Configuration example for R1
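A sketch showing the knob added to the compute profiles:

    protocols {
        source-packet-routing {
            compute-profile red-lsp {
                admin-group include-all red;
                no-label-stack-compression;   # return a full adjacency-SID list
            }
            compute-profile blue-lsp {
                admin-group include-all blue;
                no-label-stack-compression;
            }
        }
    }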
Verifying the SR-TE LSPs computed by the Distributed SR CSPF
user@R1# run show spring-traffic-engineering lsp detail
Name: sr-te-lsp-to-r6
  Tunnel-source: Static configuration
  To: 1.1.1.6
  State: Up
    Path: 1st-seg-to-r6
      Outgoing interface: NA
      Auto-translate status: Disabled
      Auto-translate result: N/A
      Compute Status:Enabled , Compute Result:success , Compute-Profile Name:red-lsp
      Total number of computed paths: 1
      Computed-path-index: 1
        BFD status: N/A
        BFD name: N/A
        computed segments count: 4
          computed segment : 1 (computed-adjacency-segment):
            label: 16
            source router-id: 1.1.1.1, destination router-id: 1.1.1.3
            source interface-address: 10.1.13.1, destination interface-address: 10.1.13.2
          computed segment : 2 (computed-adjacency-segment):
            label: 18
            source router-id: 1.1.1.3, destination router-id: 1.1.1.4
            source interface-address: 10.1.34.1, destination interface-address: 10.1.34.2
          computed segment : 3 (computed-adjacency-segment):
            label: 20
            source router-id: 1.1.1.4, destination router-id: 1.1.1.5
            source interface-address: 10.1.45.1, destination interface-address: 10.1.45.2
          computed segment : 4 (computed-adjacency-segment):
            label: 24
            source router-id: 1.1.1.5, destination router-id: 1.1.1.6
            source interface-address: 10.1.56.1, destination interface-address: 10.1.56.2
    Path: 2nd-seg-to-r6
      Outgoing interface: NA
      Auto-translate status: Disabled
      Auto-translate result: N/A
      Compute Status:Enabled , Compute Result:success , Compute-Profile Name:blue-lsp
      Total number of computed paths: 1
      Computed-path-index: 1
        BFD status: N/A
        BFD name: N/A
        computed segments count: 3
          computed segment : 1 (computed-adjacency-segment):
            label: 19
            source router-id: 1.1.1.1, destination router-id: 1.1.1.2
            source interface-address: 10.1.12.1, destination interface-address: 10.1.12.2
          computed segment : 2 (computed-adjacency-segment):
            label: 21
            source router-id: 1.1.1.2, destination router-id: 1.1.1.5
            source interface-address: 10.1.25.1, destination interface-address: 10.1.25.2
          computed segment : 3 (computed-adjacency-segment):
            label: 24
            source router-id: 1.1.1.5, destination router-id: 1.1.1.6
            source interface-address: 10.1.56.1, destination interface-address: 10.1.56.2
Lastly, as discussed above, maximum SID depth continues to play an important role during dynamic CSPF. As in the previous PCEP example, a maximum SID depth can be set for each compute-profile using the maximum-segment-list-depth <value> keyword.
Configuration example for R1
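A sketch, bounding the computed SID list at an assumed depth of 4:

    protocols {
        source-packet-routing {
            compute-profile red-lsp {
                admin-group include-all red;
                maximum-segment-list-depth 4;   # reject paths needing more SIDs
            }
        }
    }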
A Centralized Controller or PCE for External CSPF
When discussing SR there is oftentimes an assumption that a controller or centralized PCE is present, so let’s briefly explore how the Northstar TE Controller can be leveraged as an external path computation source for SR paths.
BGP Link-state for Topology Discovery
In order for the controller (PCE) to perform any kind of path computation, it must be synchronized with the network "topology." A network topology can take the form of a traffic engineering database (TED), much like what RSVP-TE uses for path computation, a link-state database (LSDB), or even more physical representations. The following example shows how BGP-LS conveys a TED to Juniper's Northstar Controller.
Please refer to the Visualization section in the Observability chapter for a BGP-LS configuration example.
PCEP for SR-TE LSP Creation and Control
The next relevant piece of information a controller requires is the ability to learn or create SR-TE LSP state. The next configuration example shows a Path Computation Element Protocol (PCEP) session between a Path Computation Client (PCC), or ingress router, and the controller. Figure 13 shows the PCEP session parameters.
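A minimal PCC-side sketch; the controller name NS1 matches the controller seen in later output, while its address here is an assumption:

    protocols {
        pcep {
            pce NS1 {
                destination-ipv4-address 10.0.0.100;   # Northstar address (assumed)
                destination-port 4189;                 # standard PCEP port
                pce-type active stateful;              # allow the PCE to update LSPs
                lsp-provisioning;                      # allow PCE-initiated LSPs
                spring-capability;                     # advertise SR support
            }
        }
    }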
Now let’s revisit some of the SR-specific SID types and see how they ‘look’ on the controller and how to announce them.
Anycast SIDs are advertised by the IGP SR extensions during LSDB flooding; they are created by configuring a second set of lo0.0 addresses and announcing a prefix-SID for them. The next example announces the anycast SID 123 from nodes p1.nyc and p2.nyc.
Announcing anycast SIDs from p1.nyc and p2.nyc
Verifying on pe1.nyc
user@pe1.nyc> show isis database p2.nyc.00-00 extensive | match "1.1.2.3|123"
    IP prefix: 1.1.2.3/32  Metric: 0  Internal  Up
    IP prefix: 1.1.2.3/32, Internal, Metric: default 0, Up
    IP extended prefix: 1.1.2.3/32 metric 0 up
      Prefix SID, Flags: 0x00(R:0,N:0,P:0,E:0,V:0,L:0), Algo: SPF(0), Value: 123
  IP address: 1.1.2.3

user@pe1.nyc> show isis database p1.nyc.00-00 extensive | match "1.1.2.3|123"
    IP prefix: 1.1.2.3/32  Metric: 0  Internal  Up
    IP prefix: 1.1.2.3/32, Internal, Metric: default 0, Up
    IP extended prefix: 1.1.2.3/32 metric 0 up
      Prefix SID, Flags: 0x00(R:0,N:0,P:0,E:0,V:0,L:0), Algo: SPF(0), Value: 123
  IP address: 1.1.2.3
The anycast SID appears as a node property in the Northstar GUI and can later be chosen as a strict or loose hop (more likely a loose hop) when creating the SR-TE LSP. Binding SIDs are added to an SR-TE LSP, as shown in Figure 14, and then advertised to Northstar via PCEP.
Creating a Core SR-TE LSP with a binding SID
Verifying the core LSP on P1.NYC
user@p1.nyc# run show spring-traffic-engineering lsp detail
Name: P1NYC-2-P1IAD
  Tunnel-source: Static configuration
  To: 128.49.106.9
  State: Up
  Telemetry statistics:
    Sensor-name: ingress-P1NYC-2-P1IAD, Id: 3758096386
    Sensor-name: transit-P1NYC-2-P1IAD, Id: 3758096387
    Path: P1NYC-2-P1IAD
      Outgoing interface: ge-0/0/4.0
      Auto-translate status: Enabled
      Auto-translate result: Success
      BFD status: N/A
      BFD name: N/A
      SR-ERO hop count: 2
        Hop 1 (Strict):
          NAI: IPv4 Adjacency ID, 0.0.0.0 -> 192.0.2.15
          SID type: 20-bit label, Value: 94
        Hop 2 (Strict):
          NAI: IPv4 Adjacency ID, 0.0.0.0 -> 192.0.2.26
          SID type: 20-bit label, Value: 51
Total displayed LSPs: 1 (Up: 1, Down: 0)
Verifying the routing table on p1.nyc
user@p1.nyc# run show route protocol spring-te

inet.0: 49 destinations, 60 routes (46 active, 0 holddown, 3 hidden)

inet.3: 11 destinations, 15 routes (11 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

128.49.106.9/32    *[SPRING-TE/8] 00:02:45, metric 1, metric2 20
                   >  to 192.0.2.21 via ge-0/0/5.0, Push 1009
                      to 192.0.2.13 via ge-0/0/1.0, Push 1009, Push 1008(top)

iso.0: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden)

mpls.0: 62 destinations, 62 routes (62 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

1000001            *[SPRING-TE/8] 00:02:45, metric 1, metric2 20
                   >  to 192.0.2.21 via ge-0/0/5.0, Swap 1009
                      to 192.0.2.13 via ge-0/0/1.0, Swap 1009, Push 1008(top)
Verifying binding SIDs on the Northstar TE Controller
Binding SIDs can be verified using LSP properties or by adding a binding SID column to the Tunnels tab.
Now that you have explored various aspects of TE, several of the SR-specific SID types, and how basic information is synchronized with a controller, let’s revisit the sample network from previous chapters and bring an entire solution together!
End-to-End TE Solution
One of the key goals of transitioning the sample network to segment routing is to replicate a form of "bandwidth optimization" (TE) in the core, Level 2 IS-IS domain, where the Auto Bandwidth RSVP-TE LSPs currently provide relatively granular per-path bandwidth optimization. Because SR does not maintain per-LSP state, and thus per-LSP statistics, a bandwidth optimization solution for an SR network requires that a controller acquire data from another source, such as the streaming telemetry sources described in the Observability chapter, and arrive at a solution in a different form than RSVP-TE would provide.
First, let's start by creating a mesh of SR-TE LSPs (see Table 1) between pe1.nyc and all the PEs in the IAD PoP. These LSPs will be created ephemerally, using PCEP, such that the Northstar Controller has explicit control of each of their paths.
Table 1: SR-TE LSP Mesh
| LSP Name        | Ingress router | Egress router |
| --------------- | -------------- | ------------- |
| pe1.nyc-pe1.iad | pe1.nyc        | pe1.iad       |
| pe1.nyc-pe2.iad | pe1.nyc        | pe2.iad       |
| pe1.nyc-pe3.iad | pe1.nyc        | pe3.iad       |
| pe1.iad-pe1.nyc | pe1.iad        | pe1.nyc       |
| pe2.iad-pe1.nyc | pe2.iad        | pe1.nyc       |
| pe3.iad-pe1.nyc | pe3.iad        | pe1.nyc       |
To create an SR-TE LSP on the Northstar Controller, use the Applications > Provision LSP drop-down option or the Add button on the Tunnel tab. A pop-up window appears for entering the attributes of the LSP, as shown in Figure 16. A key step when creating SR-TE LSPs is to provide them with some nominal 'Planned Bandwidth' value. This ensures that during a periodic or triggered reoptimization, link congestion awareness can be accounted for by Northstar's CSPF, as you will see later.
At the time of this writing, per-SR-policy ingress statistics were not available to the Northstar Controller. In the future, the static 10k bandwidth value assigned to each SR-TE LSP can be replaced with dynamic, real-time data plane statistics so that more granular bandwidth optimization is realizable.
The resulting SR-TE LSP mesh is shown in Figure 17, where you can see which links are traversed by the mesh.
From the ingress PCC perspective, you can see that three SR-TE LSPs have been signaled via the Controller (PCE):
user@pe1.nyc> show path-computation-client lsp
Name               Status        PLSP-Id  LSP-Type      Controller  Path-Setup-Type
pe1.nyc-pe1.iad    Primary(Act)  10       ext-provised  NS1         spring-te
pe1.nyc-pe2.iad    Primary(Act)  11       ext-provised  NS1         spring-te
pe1.nyc-pe3.iad    Primary(Act)  12       ext-provised  NS1         spring-te
And that each has been installed in the inet.3 RIB of pe1.nyc:
user@pe1.nyc# run show route protocol spring-te table inet.3

inet.3: 8 destinations, 12 routes (8 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

128.49.106.10/32   *[SPRING-TE/8] 00:06:23, metric 1, metric2 0
                   >  to 192.0.2.5 via ge-0/0/1.0, Push 18, Push 18, Push 22(top)
128.49.106.11/32    [SPRING-TE/8] 1d 00:38:38, metric 1, metric2 0
                   >  to 192.0.2.7 via ge-0/0/2.0, Push 28, Push 20, Push 18(top)
128.49.106.13/32   *[SPRING-TE/8] 00:05:30, metric 1, metric2 0
                   >  to 192.0.2.5 via ge-0/0/1.0, Push 20, Push 20, Push 26(top)
You can verify the label stack for each SR-TE LSP via the Northstar Controller GUI, shown in Figure 18: the Record Route and ERO columns display the adjacency SIDs for the topology.
As you can see from the Junos ingress router's detailed SR-TE LSP output, the label stack matches. It's worth noting that the first label (17, in the case of SR-TE LSP pe1.nyc-pe1.iad) is not actually imposed on the packet, since the IP address from the PCEP NAI (node or adjacency identifier) field is used for output interface selection. This is illustrated in the next output.
Detailed SR-TE LSP output:
user@pe1.nyc> show spring-traffic-engineering lsp detail
Name: pe1.nyc-pe1.iad
  Tunnel-source: Path computation element protocol(PCEP)
  To: 128.49.106.11
  State: Up
  Telemetry statistics:
    Sensor-name: ingress-pe1.nyc-pe1.iad, Id: 3758096391
    Outgoing interface: NA
    Auto-translate status: Disabled
    Auto-translate result: N/A
    BFD status: N/A
    BFD name: N/A
    SR-ERO hop count: 4
      Hop 1 (Strict):
        NAI: IPv4 Adjacency ID, 192.0.2.6 -> 192.0.2.7
        SID type: 20-bit label, Value: 17
      Hop 2 (Strict):
        NAI: IPv4 Adjacency ID, 192.0.2.22 -> 192.0.2.23
        SID type: 20-bit label, Value: 18
      Hop 3 (Strict):
        NAI: IPv4 Adjacency ID, 192.0.2.33 -> 192.0.2.32
        SID type: 20-bit label, Value: 20
      Hop 4 (Strict):
        NAI: IPv4 Adjacency ID, 192.0.2.39 -> 192.0.2.38
        SID type: 20-bit label, Value: 28
JUNOS inet.3 RIB entry for the SR-TE LSPs:
user@pe1.nyc# run show route protocol spring-te table inet.3

inet.3: 8 destinations, 12 routes (8 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

128.49.106.10/32   *[SPRING-TE/8] 00:06:23, metric 1, metric2 0
                   >  to 192.0.2.5 via ge-0/0/1.0, Push 18, Push 18, Push 22(top)
128.49.106.11/32    [SPRING-TE/8] 1d 00:38:38, metric 1, metric2 0
                   >  to 192.0.2.7 via ge-0/0/2.0, Push 28, Push 20, Push 18(top)
128.49.106.13/32   *[SPRING-TE/8] 00:05:30, metric 1, metric2 0
                   >  to 192.0.2.5 via ge-0/0/1.0, Push 20, Push 20, Push 26(top)
Our sample network is providing several services to attached CE routers. From ce1.nyc you can see the resulting label stack of the transport SR-TE LSPs between pe1.nyc and pe2.iad:
user@ce1.nyc> traceroute 198.51.100.60
traceroute to 198.51.100.60 (198.51.100.60), 30 hops max, 40 byte packets
 1  pe1.nyc-ge-0-0-10.1 (198.51.100.1)  3.008 ms  1.918 ms  1.972 ms
 2  p1.nyc-ge-0-0-2.0 (192.0.2.5)  12.785 ms  9.685 ms  24.882 ms
     MPLS Label=22 CoS=0 TTL=1 S=0
     MPLS Label=18 CoS=0 TTL=1 S=0
     MPLS Label=18 CoS=0 TTL=1 S=1
 3  p1.ewr-ge-0-0-2.0 (192.0.2.15)  8.927 ms  14.411 ms  8.648 ms
     MPLS Label=18 CoS=0 TTL=1 S=0
     MPLS Label=18 CoS=0 TTL=2 S=1
 4  p1.iad-ge-0-0-6.0 (192.0.2.26)  9.039 ms  8.564 ms  10.233 ms
     MPLS Label=18 CoS=0 TTL=1 S=1
 5  pe2.iad-ge-0-0-1.0 (192.0.2.40)  7.857 ms  7.481 ms  17.570 ms
 6  ce1.iad-ge-0-0-7.0 (198.51.100.60)  8.767 ms  13.242 ms  15.533 ms

user@ce1.nyc> ping 198.51.100.54 rapid count 100000000 size 500
PING 198.51.100.54 (198.51.100.54): 500 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[output truncated]
Now let's get back to how to provide a bandwidth optimization service for the IS-IS Level 2 domain using the Northstar Controller. The original core network was built on RSVP-TE Auto Bandwidth LSPs, which dynamically adapt to increasing and decreasing traffic rates. Segment routing currently has no equivalent capability, primarily due to the lack of transit LSP state. Instead, we will enable a Northstar Controller feature that reacts to interface congestion, based solely on ingress statistics collection, by triggering SR-TE LSP re-optimization. To enable this feature, go to Administration > Analytics and toggle the Reroute feature On, as shown in Figure 20.
To ensure the topology also displays real-time interface statistics, use the drop-down box on the left-hand side of the GUI: select Performance and enable Interface Utilization, as shown in Figure 21.
To simulate traffic on the network, in order to illustrate the controller’s ability to reroute SR-TE LSPs away from congested links, let’s start some extended pings with large packet sizes between pe1.nyc and p1.iad:
user@pe1.nyc> traceroute 192.0.2.26
traceroute to 192.0.2.26 (192.0.2.26), 30 hops max, 40 byte packets
 1  p1.nyc-ge-0-0-2.0 (192.0.2.5)  3.569 ms  2.412 ms  3.261 ms
 2  p1.ewr-ge-0-0-2.0 (192.0.2.15)  6.769 ms  3.793 ms  3.152 ms
 3  p1.iad-ge-0-0-6.0 (192.0.2.26)  6.040 ms  5.814 ms  5.074 ms

user@pe1.nyc> ping rapid 192.0.2.26 count 100000000 size 500
PING 192.0.2.26 (192.0.2.26): 500 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[output truncated]
And likewise in the other direction…
user@p1.iad> traceroute 128.49.106.1
traceroute to 128.49.106.1 (128.49.106.1), 30 hops max, 40 byte packets
 1  p1.phl-ge-0-0-3.0 (192.0.2.31)  2.637 ms  2.077 ms  2.322 ms
 2  p1.nyc-ge-0-0-5.0 (192.0.2.20)  3.493 ms  5.646 ms  3.116 ms
 3  pe1.nyc-lo0.0 (128.49.106.1)  7.056 ms  5.973 ms  7.435 ms

user@p1.iad> ping 128.49.106.1 rapid count 10000000 size 1000
PING 128.49.106.1 (128.49.106.1): 1000 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[output truncated]
As you can see in Figure 22, Northstar detects the link congestion, which triggers a path reoptimization and reroutes the SR-TE LSPs away from the congested link(s).
By selecting Timeline in the left panel, you can see in Figure 23 that link congestion, based on the interface congestion threshold, has been detected and the LSPs that were traversing the congested link have been scheduled for rerouting.
Going back to Northstar's Topology > Tunnel View, you can see the SR-TE LSP between pe1.nyc and pe2.iad in Figure 24; the LSP is now on a new path avoiding the congested link(s).
And the ingress router’s SR-TE LSP has been updated as well:
user@pe1.nyc> show route table inet.3 protocol spring-te

inet.3: 8 destinations, 12 routes (8 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

128.49.106.10/32   *[SPRING-TE/8] 00:00:07, metric 5, metric2 0
                   >  to 192.0.2.5 via ge-0/0/1.0, Push 18, Push 20, Push 26(top)
128.49.106.11/32    [SPRING-TE/8] 00:00:07, metric 5, metric2 0
                   >  to 192.0.2.5 via ge-0/0/1.0, Push 28, Push 18, Push 22(top)
128.49.106.13/32   *[SPRING-TE/8] 00:00:07, metric 5, metric2 0
                   >  to 192.0.2.5 via ge-0/0/1.0, Push 20, Push 18, Push 22(top)
Traffic engineering is an indispensable function in most backbone wide area networks, and a key objective of modern TE is the optimization of resource utilization. RSVP-TE has a long history and a large toolbox to leverage, while SR needs newer tools, such as streaming telemetry and controllers, to achieve comparable resource utilization. This chapter covered a number of SR-specific options for creating explicit paths, various forms of distributed and centralized path computation, and finally, how our sample network can be transitioned to SR-TE. While the bandwidth optimization solution differs quite a bit from traditional RSVP-TE, it provides a means of reacting to interface congestion to approximate resource optimization.