Application Quality of Experience (AppQoE)
- Introduction to Application Quality of Experience
- Benefits of Application Quality of Experience
- Limitations
- How Application Quality of Experience Works?
- How Application Quality of Experience Measures Application Performance
- Switching Application Traffic to An Alternate Path
Introduction to Application Quality of Experience
The relentless growth of cloud computing, mobility, and Web-based applications requires that the network identify and control traffic at the application level and handle each application type separately to provide quality of experience (QoE) for users. To ensure application-specific QoE (AppQoE), you need to effectively prioritize, segregate, and route application traffic without compromising performance or availability.
AppQoE uses the capabilities of two application security services: application identification (AppID) and advanced policy-based routing (APBR). It uses AppID to identify specific applications in your network and APBR to specify a path for certain traffic by associating SLA profiles with the routing instance on which the application traffic is sent according to the APBR rules.
One of the important requirements of a software-defined WAN (SD-WAN) is to measure the quality of underlay network paths and, based on the results, determine the best paths to use for the delivery of each packet. AppQoE monitors the performance of business-critical applications and, based on the score, selects the best possible link for that application traffic to meet the performance requirements specified in the service-level agreement (SLA).
The presence of an SLA rule in the APBR configuration triggers the AppQoE functionality; if no SLA profiles are available, APBR functions without triggering AppQoE.
Supported Use Case
You can configure AppQoE optimally using Contrail Service Orchestration (CSO). We recommend that you use CSO to configure AppQoE for Juniper Networks Contrail SD-WAN solution. For more details, see Application Quality of Experience Overview and Configure and Monitor Application Quality of Experience.
Supported SRX Series Firewalls
AppQoE is supported on both hub-and-spoke and full mesh topologies in SD-WAN deployments.
You can configure vSRX Virtual Firewall instances, SRX300 line devices, and SRX550M devices as spoke devices, and SRX1500, SRX4100, and SRX4200 devices as hub devices.
You can configure AppQoE between two SRX Series Firewall endpoints (book-ended). Both SRX Series Firewalls must run the same version of the Junos OS image.
Starting in Junos OS Release 15.1X49-D160 and in Junos OS 19.1R1, SRX4100 and SRX4200 devices support AppQoE when the devices are in chassis cluster mode. You can configure the device to operate in both active/active and active/passive modes and deploy the device as a spoke device in SD-WAN deployments.
Benefits of Application Quality of Experience
Enables cost-effective QoE by providing real-time monitoring of application traffic to provide a consistent and predictable level of service.
Increases customer retention and satisfaction by providing a guaranteed SLA for the delivery of certain traffic (such as video traffic). AppQoE ensures that the approved traffic receives the appropriate priority and the bandwidth required to provide the best quality of experience to the user.
Limitations
Implementation of AppQoE on security devices has the following limitations:
All the different routes to the destination through different interfaces must have the same preference, weight, and metrics configured. All routes must be added as ECMP paths for the destination and must also be part of the same forwarding table.
AppQoE SLA service is supported only between two security device endpoints (book-ended). End-to-end AppQoE SLA service is not supported.
AppQoE can be applied only if all interfaces are part of the same zone.
AppQoE cannot be applied for reverse traffic.
AppQoE does not influence any change in the destination of a session.
AppQoE does not support IPv6/UDP probe encapsulation, GRES, chassis cluster (ISSU, high-availability, dual CPE high availability, Z-mode high availability), and logical systems.
AppQoE does not support preferred path selection or transit virtual routing and forwarding (VRF).
AppQoE does not support passive probing on IPv6 data packets.
An input firewall filter is required at the non-WAN interfaces to discard UDP packets with UDP destination port 36000.
The SRX4600 device has the following limitations:
The class of service (CoS) for generic routing encapsulation (GRE) is not supported when AppQoE is configured.
Passive probe details might not be available for each short-lived session.
Synchronization of the session states might not happen on the secondary node in Z-line mode traffic processing when the device is operating in chassis cluster mode.
How Application Quality of Experience Works?
AppQoE utilizes AppID and APBR capabilities to identify specific applications/application groups and specify a path for certain traffic by associating SLA profiles to a routing instance on which the application traffic is sent as per APBR rules.
AppQoE monitors the performance of applications and, based on the score, selects the best possible link for that application traffic to meet the performance requirements specified in the service-level agreement (SLA).
- Identifying Applications or Application Groups
- Specifying Path for Applications or Application Groups
- Application Traffic Path Selection
Identifying Applications or Application Groups
The following steps are involved in identifying applications or application groups:
Junos OS application identification identifies applications and once an application is identified, its information is saved in the application system cache (ASC).
APBR evaluates the packets to determine whether the session is a candidate for application-based routing (advanced policy-based routing). If this is the first packet of a new session and the traffic is not flagged for application-based routing, it undergoes normal processing (non-APBR route) to the destination.
If the session needs application-based routing, APBR queries the ASC module to get the application attributes (IP address, destination port, protocol type, and service).
- If the application is found in the ASC, traffic is further processed for a matching rule in the APBR profile.
- If a matching rule is found, the traffic is redirected to the specified routing instance for the route lookup.
- AppQoE checks whether an SLA is enabled for a session. If the session is a candidate for an SLA measurement, AppQoE initiates active and passive probes for performance measurements.
- If SLA is not enabled for the session in the APBR rule, AppQoE ignores that session and the default behavior of APBR is applied to it; that is, traffic is routed through the specified routing instance for the destination.
- If the application is not found in the ASC, APBR requests deep inspection of the flow; that is, with the application signature package installed and application identification enabled for the session, the ASC is populated for use by subsequent sessions for APBR processing (see step 2).
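The identification steps above can be sketched as a small decision function. This is only an illustrative model, not Junos OS code: the names `ApbrRule`, `ApbrProfile`, and `classify_session`, the action labels, and the handling of an identified application with no matching rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ApbrRule:
    applications: frozenset      # hypothetical: application names this rule matches
    routing_instance: str        # routing instance that matching traffic is redirected to
    sla_enabled: bool = False    # True when an SLA rule is attached to the APBR rule


@dataclass
class ApbrProfile:
    rules: list = field(default_factory=list)

    def match(self, application: str) -> Optional[ApbrRule]:
        """Return the first rule that lists the application, or None."""
        for rule in self.rules:
            if application in rule.applications:
                return rule
        return None


def classify_session(flow_key, asc: dict, profile: ApbrProfile) -> tuple:
    """Return (action, routing_instance, probes_started) for a new session."""
    application = asc.get(flow_key)
    if application is None:
        # ASC miss: request deep inspection so that later sessions hit the cache.
        return ("deep-inspection", None, False)
    rule = profile.match(application)
    if rule is None:
        # Assumption: without a matching rule, the normal (non-APBR) route is used.
        return ("default-route", None, False)
    # Matching rule: redirect, and start probes only when an SLA is attached.
    return ("redirect", rule.routing_instance, rule.sla_enabled)
```

In this model, a session whose rule has no SLA attached is still redirected to the routing instance, which mirrors the default APBR behavior described above.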
Specifying Path for Applications or Application Groups
The following steps summarize how AppQoE specifies a path for the application traffic according to the SLA rules.
APBR uses the application details to look for a matching rule in the APBR profile (application profile). Traffic matching the applications and application groups is forwarded to the static route and the next-hop address specified in the routing instance.
An SLA rule attached to the APBR profile specifies the parameters that are required to measure the SLA and to identify whether an SLA violation has occurred.
The application traffic is assigned to a particular overlay link based on the SLA metrics of that overlay link, measured using active probing.
The SLA violation is determined through passive probing of live application/application group traffic. The best path/overlay link for the application/application group is determined through the path selection algorithm.
Application Traffic Path Selection
The following steps take place when routing data traffic from source to destination, specifically to select the best path:
For the first data packet of a flow (first path), if the application is already known (from the ASC lookup), then the best path for the application is searched in the database. If the application is not known or is new (from ASC lookup), then a random path or the default path is chosen. This path continues for the entire session. Later, after the application is detected by the DPI, the database is updated with the best path for the application.
For the remaining data packets of a flow (fast path), if the application is not known initially, the session continues on the same path. If the application is known initially, AppQoE selects the best path for the application traffic.
When a new application is detected, the path selection mechanism attempts to find a path that satisfies all the SLA metrics. If no such path exists, the next best path (based on the number of metrics satisfied) is used. If more than one path satisfies the metrics, a random path among the available paths is selected. Depending on the profile configuration, an SLA violation is detected either when any one of the metrics is violated or when none of the metrics meets the requirement.
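The selection rule in the last step can be modeled in a few lines. The following is a sketch under stated assumptions, not the actual AppQoE algorithm: the metric names, the `match` parameter values ("any" and "all"), and the choice to break ties among next-best paths randomly are all hypothetical.

```python
import random


def metrics_satisfied(measured: dict, sla: dict) -> int:
    """Count the SLA metrics whose measured value is within its threshold."""
    return sum(
        1 for name, limit in sla.items()
        if measured.get(name, float("inf")) <= limit
    )


def select_path(paths: dict, sla: dict, rng=random) -> str:
    """Prefer a path meeting every SLA metric, else the one meeting the most."""
    scores = {name: metrics_satisfied(m, sla) for name, m in paths.items()}
    best = max(scores.values())
    candidates = sorted(name for name, score in scores.items() if score == best)
    # Several equally good paths: pick one at random.
    return rng.choice(candidates)


def is_violated(measured: dict, sla: dict, match: str = "any") -> bool:
    """'any': one failed metric is a violation; 'all': every metric must fail."""
    satisfied = metrics_satisfied(measured, sla)
    if match == "any":
        return satisfied < len(sla)
    return satisfied == 0
```

For example, with thresholds of 100 ms RTT, 20 ms jitter, and 1% loss, a path measuring 150 ms RTT is in violation under "any" matching but not under "all" matching, provided its jitter and loss are within limits.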
How Application Quality of Experience Measures Application Performance
Application performance is determined by the following indicators:
Latency: The time a packet takes to travel across the medium, which depends on the length of the medium and the distance to be covered.
RTT: The round-trip time required for a packet to travel from source to destination and back.
Packet loss: The number of packets lost per 100 packets sent by a host.
Jitter: The difference in latency from packet to packet. Ingress jitter, egress jitter, and two-way jitter can be specified for evaluating the performance of the link.
AppQoE monitors RTT, jitter, and packet loss on each link, and based on the score, seamlessly diverts applications to the alternate path if performance of the primary link is below acceptable levels as specified by SLA. Measurement and monitoring of application performance is done using active and passive probes to detect SLA violations and to select an alternate path for that particular application.
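As a rough illustration of two of the indicators defined above, the sketch below computes packet loss per 100 packets and a jitter figure from per-packet latencies. Averaging the packet-to-packet differences is an assumption made for the example; the text does not state how AppQoE aggregates jitter, and the function names are hypothetical.

```python
def packet_loss_percent(sent: int, received: int) -> float:
    """Packets lost per 100 packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def mean_jitter(latencies_ms: list) -> float:
    """Mean absolute difference in latency between consecutive packets."""
    if len(latencies_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return sum(diffs) / len(diffs)
```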
AppQoE collects real-time data by continuously monitoring application traffic and identifying network or device issues by:
Monitoring the performance on all configured overlay links.
Using passive probes (inline with the application datapath) and active probes (synthetic probes for a specific application) to monitor the traffic performance for an application or application group.
Sending all collected performance metrics or metadata for analysis to a log collector.
Comparing a specified application against a specific performance metric and changing the path for the application traffic dynamically in case of an SLA violation.
Supporting flexible SLA metric configuration for a given application or application group.
AppQoE measures the application SLA across multiple WAN links, and maps the application traffic to a path among the available links, that is, to the path that best serves the SLA requirement.
Application Performance Measurement by Using Active and Passive Probes
Active and passive probe measurements are the two approaches used for end-to-end analysis of the network.
Active probe—Active probes measure the service quality of the application to provide an end-to-end measurement of the network performance.
In active probing, custom packets are sent between spoke and hub points on all the routes, and the RTT, latency, jitter, and packet loss are measured between the installed probe points. The active probes are sent periodically on all the active and passive links. A configured number of samples is collected, and a running average is computed for each application's probe path. If a violation is detected for any application traffic, the probe metrics are evaluated to determine the best link that satisfies the SLA.
Passive probe—Passive probes are installed on links within the network, and they monitor all the traffic that flows through those links.
Passive probing monitors links for SLA violations on live data traffic. In a passive probe, the actual data packets are encapsulated in an IP/UDP probe header in the live traffic between the SRX Series book-ended points, and RTT, jitter and packet loss between the points of installation of the probes are measured to compute the service quality.
If there is a violation detected for any application, the synthetic probe metrics are evaluated to determine the best link that satisfies the SLA.
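The running average that active probing keeps for each probe path can be pictured as a fixed-size window of samples. This is a minimal sketch, assuming the average is taken over the most recent configured number of samples; the class name `ProbePathStats` and the sliding-window interpretation are assumptions, not documented behavior.

```python
from collections import deque


class ProbePathStats:
    """Keeps the last `sample_count` RTT samples for one probe path."""

    def __init__(self, sample_count: int):
        # deque with maxlen drops the oldest sample automatically.
        self.samples = deque(maxlen=sample_count)

    def add(self, rtt_ms: float) -> None:
        self.samples.append(rtt_ms)

    def average(self):
        """Running average over the window, or None before any sample arrives."""
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)
```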
Note: Starting in Junos OS Release 18.3R1 and in Junos OS Release 15.1X49-D150, on all supported SRX Series Firewalls and vSRX Virtual Firewall instances, for passive probes to detect that a link or path is down, a minimum of three probe requests and 100% packet loss must occur in a sampling period for a given session to trigger an SLA violation.
Note: When the device is operating in chassis cluster mode, if the secondary node (node 1), through which traffic is forwarded, is rebooted, the application traffic switches multiple times between the secondary node links. This happens when the available links on the primary node (node 0) have a lower active probe SLA path score than the secondary node links. This behavior continues until AppQoE active probe SLA path score results are available to indicate that there is 100% packet loss on all the links on the secondary node.
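The link-down condition for passive probes can be expressed as a single check per sampling period. The function below is a hypothetical model of that rule (at least three probe requests and no responses), with names chosen for the example.

```python
def link_down_by_passive_probe(requests_sent: int, responses_received: int) -> bool:
    """True when a sampling period has >= 3 probe requests and 100% packet loss."""
    return requests_sent >= 3 and responses_received == 0
```

With only two requests in the period, the check stays false even if both are lost, so a very short session alone does not mark the path as down.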
You can configure an SLA rule with active and passive probe parameters and associate the SLA rule with an APBR profile. The APBR profile also includes an APBR rule. Rules are associated with one or more applications or application groups, and the traffic matching the rule is redirected to the routing instance.
AppQoE triggers the probe requests to all probe paths of the application. Active and passive probes monitor the network for areas or points of failures or congestion.
AppQoE collects traffic class statistics for learned applications using active and passive probes and takes the following actions:
Measure performance for SLA—The real-time metrics provided by probes are used to score service quality according to the SLA for an application and determine whether the application path does not meet SLA requirements. That is, if there is a violation detected for any application, the synthetic probe metrics are evaluated to determine the best alternate link for the application traffic that satisfies the SLA.
Reroute traffic—Switch the application traffic between the two links, that is, when one link has performance issues, the traffic is routed to the other link during the same session.
If the application's traffic is reachable through multiple links, you must configure all the reachable paths as overlay paths and attach the overlay paths to the application's SLA rule.
Switching Application Traffic to An Alternate Path
You can enable or disable switching of the application traffic to another route (local to the device) during an SLA violation. When local route switching is enabled, switching of the application traffic to an alternate route is enabled and the SLA monitoring and reporting functionality is also available. Even when the option for switching of the application traffic to an alternate path is disabled in the SLA rule configuration, AppQoE resolves SLA violations, for example, by switching the application traffic to a new path.
When local route switching is disabled, only the SLA monitoring and reporting functionality is available, and switching of the application traffic to a different route because of an SLA violation is turned off.
When application traffic switches to an alternate path, there is a short time period during which the application traffic cannot be switched again to another path in case of an SLA violation. This time period helps to avoid flapping of the traffic across links.
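The anti-flapping period can be modeled as a hold-down timer that suppresses further switches for a while after each switch. This is an illustrative sketch only: the class `PathSwitcher`, the `hold_down_s` parameter, and the use of caller-supplied timestamps are assumptions, and the actual duration of the period is not stated here.

```python
class PathSwitcher:
    """Tracks the current path and refuses to switch again during hold-down."""

    def __init__(self, initial_path: str, hold_down_s: float):
        self.path = initial_path
        self.hold_down_s = hold_down_s   # hypothetical hold-down duration in seconds
        self.last_switch = None          # time of the most recent switch, if any

    def on_violation(self, now: float, best_alternate: str) -> bool:
        """Switch to best_alternate unless still inside the hold-down period."""
        in_hold_down = (
            self.last_switch is not None
            and now - self.last_switch < self.hold_down_s
        )
        if in_hold_down or best_alternate == self.path:
            return False
        self.path = best_alternate
        self.last_switch = now
        return True
```

A violation reported 10 seconds after a switch is ignored with a 30-second hold-down, while one reported after 31 seconds moves the traffic again.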
Understanding AppQoE Configuration Limits
Starting in Junos OS Release 15.1X49-D160 and in Junos OS Release 19.1R1, AppQoE enforces the configuration limit for overlay paths, metric profiles, probe parameters, and SLA rules per profile when you configure application-specific SLA rules and associate the SLA rules with an APBR profile.
If you configure more parameters than the allowed limit, error messages are displayed when you commit the configuration.
Examples of error messages:
The following sample error messages are from SRX4100 and SRX4200 devices. The configuration limit values shown might not reflect the exact numbers supported; the numbers might differ between the supported devices.
[edit security advance-policy-based-routing]
  'sla-rule sla0'
    Cannot configure more than 32 sla rules
error: configuration check-out failed

[edit security advance-policy-based-routing]
  'overlay-path grep2'
    Cannot configure more than 2000 overlay paths
error: configuration check-out failed

[edit security advance-policy-based-routing]
  'metrics-profile m0'
    Max metrics for this system is 32
error: configuration check-out failed

[edit security advance-policy-based-routing]
  'active-probe-params pr0'
    Cannot configure more than 64 probe params
error: configuration check-out failed
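Before committing a large generated configuration, a provisioning script could count the objects it is about to create and compare them with the limits. The sketch below is hypothetical: the limit values are copied from the sample messages above and, as noted, might differ between devices; `SAMPLE_LIMITS` and `over_limit` are names invented for the example.

```python
# Limits taken from the sample SRX4100/SRX4200 messages; other devices may differ.
SAMPLE_LIMITS = {
    "sla-rule": 32,
    "overlay-path": 2000,
    "metrics-profile": 32,
    "active-probe-params": 64,
}


def over_limit(config_counts: dict, limits: dict = SAMPLE_LIMITS) -> list:
    """Return the object types whose configured count exceeds the limit."""
    return sorted(
        kind for kind, count in config_counts.items()
        if kind in limits and count > limits[kind]
    )
```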
Application Path Selection Based on Link Preference and Priority
One of the important requirements of a software-defined WAN (SD-WAN) is to measure the quality of underlay network paths and, based on the results, determine the best paths to use for the delivery of each packet.
Starting in Junos OS Release 18.4R1 and in Junos OS Release 15.1X49-D160, you can configure application-specific quality of experience (AppQoE) to select the application path based on the link priority and the link type when multiple paths that meet the SLA requirements are available.
You can select an MPLS or Internet link as the preferred path and assign a priority from 1 through 255, with a lower value indicating a more preferred link. A value of 1 indicates the highest priority. If multiple paths are available, the path with the highest priority is selected.
For example, if an MPLS path is selected for VoIP traffic and quality degradation occurs during a call because of jitter or packet loss, the packets are sent through another path (Internet) that meets the SLA requirements. Application traffic is then sent through the Internet path, and if the quality of the Internet path degrades, the path is switched back to MPLS.
You can configure the link priority and link type of each underlay interface in an advanced policy-based routing (APBR) rule, and the same parameters are inherited by the corresponding overlay. An underlay interface in this case is the final outgoing interface in the routing topology for the overlay.
For example, in a network infrastructure, if the underlay is a fourth-generation (4G) LTE connection, then the dialer interface can be configured as the underlay interface for AppQoE. Similarly, if the underlay is a DSL connection, then the corresponding Point-to-Point Protocol over Ethernet (PPPoE) interface can be configured as the underlay interface for AppQoE.
Starting in Junos OS Release 21.2R1, the AppQoE path selection mechanism is enhanced with custom link tag configuration, application traffic switch to the higher priority link of the preferred tags, non-SLA metrics based deployment, and overlay interface attribute preference features.
Benefits of Application Path Preference and Priority
- Provides the flexibility to select the best path for application traffic.
- Enables routing of application traffic over the cost-effective connectivity option while ensuring SLA requirements (latency and jitter) are met.
- Supports dynamic path switching if the selected application path experiences a degradation in quality.
Path Selection Mechanism
Application traffic is routed through separate links based on the link preference as follows:
- The AppQoE path selection mechanism maintains a list of the best paths to a specific destination that meet the SLA requirements. From this list, AppQoE selects a path that matches the link preference configured by the user.
- If there are multiple such paths, the path that has the highest priority among them is selected.
- If no priority or link type preference is configured, a random path or the default path is selected.
- If no links that meet the SLA requirements are available, the best available link in terms of the highest SLA score is selected; if strict affinity is configured, the link type preference is also taken into account.
- If multiple links that meet the SLA requirements are available, the one with the highest priority is selected.
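The preference rules above can be sketched as follows. This is a simplified model, not the AppQoE implementation: the `Link` fields, the numeric SLA score, falling back to all compliant links when none has the preferred type, and breaking priority ties by name (rather than randomly) are assumptions made so that the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    link_type: str     # for example "mpls" or "internet"
    priority: int      # 1 through 255; a lower value is more preferred
    meets_sla: bool
    sla_score: float   # hypothetical score; higher is better


def choose_link(links: list, preferred_type=None, strict_affinity: bool = False) -> Link:
    compliant = [link for link in links if link.meets_sla]
    if compliant:
        # Prefer SLA-compliant links of the configured type, if there are any.
        preferred = [link for link in compliant if link.link_type == preferred_type]
        pool = preferred or compliant
        return min(pool, key=lambda link: (link.priority, link.name))
    # No compliant link: fall back to the highest SLA score, restricted to the
    # preferred type only when strict affinity is configured.
    pool = links
    if strict_affinity and preferred_type is not None:
        pool = [link for link in links if link.link_type == preferred_type] or links
    return max(pool, key=lambda link: link.sla_score)
```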
System Log Messages for AppQoE
Starting in Junos OS Release 19.2R1, application-level logging is available for AppQoE on SRX Series Firewalls. This feature reduces the impact on CSO or the log collector device of processing the large number of system log messages generated at the session level. The security device maintains session-level information and provides system log messages for the session level. With application-level logging replacing session-level logging, the overhead on the security device decreases and AppQoE log throughput increases.
AppQoE sends the following system log messages:
APPQOE_SLA_METRIC_VIOLATION: When a violation is detected for a session and when a session’s path is resolved as a result of moving to a new link.
APPQOE_BEST_PATH_SELECTED: When a session switches the path for its data traffic.
With application-level logging, all session-level logs are supported at the application level. The AppQoE functionality of sending real-time probes, measuring the SLA metrics, detecting violations, and switching paths continues at the session level. However, as part of the application-level summarization feature, datapath sessions report the SLA metrics, violation information, and path switches to the AppQoE database. The information received from the datapath is aggregated at the application level and then sent in the form of system logs to the collector device.
Table 1 lists the new application-level logs that are supported from Junos OS Release 19.2R1 onwards.
System Log Message | Description |
---|---|
APPQOE_APP_SLA_METRIC_VIOLATION | |
APPQOE_APP_BEST_PATH_SELECTED | |
APPQOE_APP_PASSIVE_SLA_METRIC_REPORT | |
Application-level logging introduces the following AppQoE functionality changes:
Active probe maintains and uses only real-time RTT and jitter values. For packet loss, it refers to the previous session's value because packet loss can be calculated only at the end of the window.
During configuration commit, active probe sets the RTT and jitter values to the highest 32-bit value for all entries.
Active probe retains the previous session's values until proper real-time values of the metrics are available.
When 100% packet loss is experienced in active probing, all other metrics are set to the highest 32-bit value.
Reporting of Invalid Values for RTT and Jitter
When the data for RTT and jitter is not available, log messages are sent with an invalid value of 0xFFFFFFFF, which the log collector can ignore. Table 2 lists some possible scenarios in which invalid RTT and jitter values are sent.
Scenario | Affected System Logs |
---|---|
100% packet loss | APPQOE_APP_PASSIVE_SLA_METRIC_REPORT, APPQOE_APP_SLA_METRIC_VIOLATION |
Packet loss greater than 0 and less than 100% | APPQOE_APP_PASSIVE_SLA_METRIC_REPORT, APPQOE_APP_SLA_METRIC_VIOLATION |
No packet loss | APPQOE_APP_SLA_METRIC_VIOLATION, APPQOE_APP_PASSIVE_SLA_METRIC_REPORT |
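On the collector side, the invalid marker can be filtered out before the metrics are stored or graphed. The sketch below assumes that a log message has already been parsed into a dictionary; the field names `rtt` and `jitter` and the function `valid_metrics` are hypothetical and do not reflect the actual system log field names.

```python
INVALID_METRIC = 0xFFFFFFFF  # highest 32-bit value, used as the "not available" marker


def valid_metrics(record: dict, fields=("rtt", "jitter")) -> dict:
    """Return only the RTT and jitter fields that carry a real measurement."""
    return {
        name: record[name]
        for name in fields
        if name in record and record[name] != INVALID_METRIC
    }
```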
Disable AppQoE Logging
By default, the AppQoE log-type is set to system log. If you want to disable AppQoE logging, configure the log-type as disabled in the following configuration:
Application Quality of Experience (AppQoE) Based on the DSCP Bits of Incoming Traffic
Starting in Junos OS Release 19.4R1, AppQoE supports SLA-based path selection for incoming traffic on the basis of the DSCP value. AppQoE selects the best possible link for the application traffic based on the application signature, the DSCP value, or a combination of both application identification and DSCP value.
DSCP Support in APBR
When you configure both DSCP and dynamic application in an APBR rule, the rule is considered a match if the traffic matches all the criteria specified in the rule. When multiple DSCP values are present in the APBR rule, a match on any one of the values is considered a match.
An APBR profile can contain multiple rules, each with a variety of match conditions.
When an APBR profile contains multiple APBR rules, the rule lookup uses the following priority order:
Rule with DSCP + dynamic application
Rule with dynamic application
Rule with DSCP value
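The match semantics and the lookup order above can be illustrated with a short model. It is a sketch only: the `Rule` class, its `rank` method, and the `lookup` function are hypothetical names, and the example DSCP values and application names are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    dscp_values: frozenset = frozenset()    # empty means no DSCP criterion
    applications: frozenset = frozenset()   # empty means no application criterion

    def matches(self, dscp, application) -> bool:
        if not self.dscp_values and not self.applications:
            return False
        # Any one of the listed DSCP values may match; every configured
        # criterion (DSCP and application) must be satisfied.
        dscp_ok = not self.dscp_values or dscp in self.dscp_values
        app_ok = not self.applications or application in self.applications
        return dscp_ok and app_ok

    def rank(self) -> int:
        """Lookup order: DSCP + application, then application, then DSCP."""
        if self.dscp_values and self.applications:
            return 0
        if self.applications:
            return 1
        return 2


def lookup(rules: list, dscp, application):
    """Return the highest-priority matching rule, or None if nothing matches."""
    matching = [rule for rule in rules if rule.matches(dscp, application)]
    return min(matching, key=Rule.rank, default=None)
```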
The Network Service Orchestrator can map an application to a DSCP value at an external service function, and the same mapping is provisioned at the gateway router to map the DSCP value to the desired SLA profile.
Figure 1 shows a scenario where AppQoE performs SLA-based path selection for the incoming traffic on the basis of DSCP value and application signature in a gateway router use case.
For the traffic based on the DSCP value, AppQoE works as follows:
- All the traffic entering the gateway router from the LAN undergoes application identification. Until DPI identifies an application, the system forwards the traffic stream to a default forwarding virtual routing and forwarding (VRF) instance. The VRF instance has an outgoing interface associated with it.
- Junos OS application identification identifies applications, and once an application is identified, its information is saved in the application system cache (ASC).
- The system continues to check whether any application information is available from either DPI classification or the ASC.
- The APBR mechanism classifies sessions based on well-known application signatures and DSCP values and uses policy to identify the best possible route for the application. The APBR policy maps application traffic to a specific VRF.
- The presence of an SLA rule in the APBR configuration triggers the AppQoE functionality; AppQoE performs SLA-based path selection for the traffic based on the application or DSCP value.
A single DSCP value can bundle multiple application categories, each with its own traffic pattern. In such a scenario, detecting a violation using passive probes and applying it to all the sessions might cause false negatives and false positives. As a workaround, avoid using passive probing when you have configured a DSCP-based SLA rule. You can use active probes for the destination path group to which the traffic is forwarded.
Limitations
AppQoE deployments with DSCP-based rules on a device in chassis cluster mode have the following limitations:
- If the rule match is completed before application identification is done and AppQoE moves the session to the other node, application identification does not complete. This condition occurs when a DSCP-based rule is configured.
- If you have configured two APBR rules, one with a DSCP value and one with both a DSCP value and a dynamic application, and assigned the same DSCP value in both rules, APBR matches the DSCP rule on receiving the first packet. If the best path is identified on the other node, the session is moved to the other node. In this scenario, the application sessions are matched against the DSCP rule and not against the DSCP and dynamic application rule.
APBR Policies for AppQoE
Starting in Junos OS Release 20.1R1, AppQoE utilizes the granular rule matching functionality provided by APBR to provide the quality of experience (QoE) based on the application traffic.
In Junos OS Release 18.2R1, APBR supported configuring policies by defining source addresses, destination addresses, and applications as match conditions. After a successful match, the configured APBR profile is applied as an application service for the session. In Junos OS Release 20.1R1, AppQoE leverages the APBR enhancement and selects the best possible link for the application traffic sent by APBR to meet the performance requirements specified in the SLA.
For example, suppose you want to forward Telnet and HTTPS traffic arriving at the trust zone to a specific device or interface through the best available link. When traffic arrives at the trust zone, APBR matches the traffic against the match criteria (source address, destination address, and applications) defined in the APBR policies. If the traffic matches a policy, the corresponding APBR profile is applied.
APBR uses the application details to look for a matching rule in the profile. If a matching rule is found, the traffic is redirected to the specified routing instance as defined in the rule.
AppQoE Multi-homing with Active-Active Deployment
Starting in Junos OS Release 20.2R1, AppQoE is enhanced to support multihoming with active-active deployment. Previously, AppQoE supported multihoming with active-standby deployment.
In an active-active deployment, the spoke device connects to multiple hub devices. Application traffic can transit through any of the hub devices if the link to the hub device meets the SLA requirements. Application traffic can seamlessly switch between the hub devices in case of an SLA violation or when the active hub device is not responding.
Figure 1 shows a mesh topology. In this topology, an end point is reachable through more than one node.
To enable multihoming in active-active mode, you must configure BGP multipath to allow the device to select multiple equal-cost BGP paths to reach a given destination.
When you enable BGP multipath, the device selects multiple equal-cost BGP paths to reach a given destination, and all these paths are installed in the forwarding table. AppQoE completes the route lookup and gets the next-hop route details along with the corresponding overlay-links. AppQoE obtains the overlay-link property from the configured destination path group.
Based on the application's SLA requirements and link preferences, AppQoE determines the best link among all the links in that destination path group. In case of an SLA violation, AppQoE uses the SLA score and link preferences to select alternate links across all the configured destination path groups, provided the endpoint is reachable through those links.
For more information on BGP multipath configuration, see Examples: Configuring BGP Multipath.
Limitation
In certain scenarios, when the next-hop ID for the route changes, the existing sessions remain on the SLA-violated link even though another link that meets the SLA requirements is available. However, new sessions are not affected in this case; they are routed through the links that meet the SLA requirements.
Support for SaaS Applications
Starting in Junos OS Release 20.4R1, we’ve extended application quality of experience (AppQoE) support for Software as a Service (SaaS) applications.
AppQoE performs service-level agreement (SLA) measurements across the available WAN links such as underlay, GRE, IPsec or MPLS over GRE. It then sends SaaS application data over the most SLA-compliant link to provide a consistent service.
Support for IPv6 Traffic
- Starting in Junos OS Release 21.3R1, you can use IPv6 addresses in AppQoE configurations. The support includes:
  - IPv6 address in overlay path configuration
  - Active probing sessions using IPv6 addresses as source and destination addresses
  - IPv4 and IPv6 traffic from the client side
  - Dual stacking of IPv4 and IPv6 on the LAN side
  - IPv6 address on the LAN side for SaaS (software as a service) probing
  For SaaS probing, ensure that you configure both IPv4 and IPv6 addresses for the incoming interface for IPv4 and IPv6 interoperability.
- Starting in Junos OS Release 21.4R1, you can use dual stacking of IPv4 and IPv6 for overlay and underlay networks in an AppQoE configuration.