Known Issues

This section lists the known issues in Juniper Paragon Automation.

Device Life-Cycle Management

If the device onboarding fails, the device onboarding status is displayed as Status is not available in the Devices section of the Network Implementation Plan page (Inventory > Device Onboarding > Network Implementation Plan).

Workaround: For such devices, initiate the outbound SSH connection on the router so that the onboarding workflow restarts.
In this release, single node failure and restart might cause inconsistencies in device inventory.

Workaround: None.
Sometimes, the onboarding workflow might restart for a device that is already onboarded. This is harmless, as the workflow will observe that the node is already onboarded, report the observation, and exit.

Workaround: None.
Paragon Automation triggers the configuration templates included in device profiles and interface profiles only during the initial onboarding of the device. You cannot use the configuration templates included in the device profiles and interface profiles to apply additional configuration on a device after the device is onboarded.

Workaround: If you need to apply additional configuration on a device after the device is onboarded, you need to manually apply the configuration.
If you have enabled the Trust option in the device profile, the onboarding workflow may fail with the following error:

Onboarding workflow failed reason: task_failure, Trust score computation failed.

Workaround: Retry the onboarding process after a few minutes.

Airflow scheduler pod may crash when multiple onboarding DAGs are triggered simultaneously.

By default, Airflow allocates 10 GB of disk space in the PVC mount directory (/opt/airflow/mount/). When multiple onboarding DAGs are triggered simultaneously, excessive logs are generated, filling up the disk space and potentially causing the Airflow scheduler pods to crash.

Workaround: To avoid this issue, we recommend that you limit onboarding to 100 devices over a 12-hour period. If the disk space fills up, use the following procedure to clean up the logs:

Delete all log files from /opt/airflow/mount/logs/*.

Restart the airflow-scheduler pods.
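
The cleanup procedure above can be sketched as a small shell helper. This is a minimal sketch and not an official procedure: it assumes that /opt/airflow/mount/ is reachable from inside the airflow-scheduler pod and that deleting the pod causes its controller to start a replacement. The function name clean_airflow_logs, the namespace, and the pod name are hypothetical values that you must supply for your cluster.

```shell
# Hypothetical helper: deletes the Airflow log files and restarts the scheduler pod.
# Arguments: namespace and pod name (both are placeholders; look them up with
# `kubectl get pods -A` before running).
clean_airflow_logs() {
  ns="$1"
  pod="$2"
  # Delete the accumulated log files; the glob is expanded by the shell inside the pod.
  kubectl exec -n "$ns" "$pod" -- sh -c 'rm -rf /opt/airflow/mount/logs/*' || return 1
  # Delete the pod so that its controller creates a fresh one (assumed restart method).
  kubectl delete pod -n "$ns" "$pod"
}
```

For example, run clean_airflow_logs with the namespace and the scheduler pod name that kubectl reports for your installation, and repeat it for each airflow-scheduler pod.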

If you try to adopt Cisco devices using the GUI or try to register Cisco devices in bulk using the REST APIs, the system does not trigger a connection to the devices. Therefore, the device status is shown as disconnected.

Workaround: You can add Cisco devices using the Register Device REST API.
You cannot release a Cisco device using the Release Router option. This issue occurs if you have added the Cisco device to Paragon Automation from the Inventory page (Inventory > Devices > Network Inventory) and you have used non-alphanumeric characters for the MAC address.

Workaround: None.

Onboarding an ACX7024 device fails with the following error:

task_failure, Failed to launch Software update. Timed out or waiting for the launch message. Try again later.

Workaround: Restart the onboarding process using the service orchestration cMGD CLI. Perform the following steps:

Log in to the primary node using SSH.
Exit the default paragon CLI to the Linux root shell.
Log in to the service orchestration cMGD CLI.
Get the organization UUID from the GUI. To find your organization's UUID, go to Administration > Settings in Paragon Automation and then copy the organization's UUID.
Configure the organization UUID in the cMGD CLI:

Retrieve the service order.

Reset the node state.

Restart the onboarding by restarting the connection.

Observability

The Hardware accordion does not display the following information for the listed devices:

Chassis temperature (chassis-temperature) for MX204, MX240, MX304, MX10004, MX10008, and MX10016 devices.
Charts related to the fan speed (rpm-percent) for MX480, MX960, MX10004, MX10008, and MX10016 devices.
Power supply module temperature (psm-temperature) for MX204, MX480, and MX960 devices.

Line card charts for some ACX Series and MX Series devices as the flexible PIC concentrator (FPC) fields are not supported on these devices. See Table 1 for more information.

Table 1: Line Card Charts Support
Device Family	Device Series	FPC Fields Not Supported
ACX Series	ACX7100-32C, ACX7100-48L, ACX7024, ACX7024X, ACX7509, ACX7348	fpc-temperature, fpc-cpu-utilization, fpc-buffer-memory-utilization
MX Series	MX204, MX240, MX304, MX480, MX960, MX10004, MX10008, MX10016	fpc-temperature, fpc-cpu-utilization

On the Interfaces accordion, forward error correction (FEC) corrected errors and FEC uncorrected errors charts are available only on interfaces that support speeds equal to or greater than 100 Gbps.
After you apply a new configuration for a device, the Active Configuration for Device-Name page (Observability > Troubleshoot Device > Device-Name > Configuration accordion > View active config link) does not display the latest configuration immediately. It takes several minutes for the latest changes to be reflected on the Active Configuration for Device-Name page.

Workaround: You can verify whether the new configurations are applied to the device by logging in to the device using CLI.
The graphs related to cyclic redundancy check (CRC) errors display No Results Found as the CRC errors-related data is not streamed from the devices for the management ports.

Workaround: None.
If a device is discovered through a BGP-LS peering session even before you onboard the device, then duplicate LSPs are created when a PCEP session is established with the device. In some rare cases, the duplicate LSPs may remain.

Workaround: If you see duplicate LSPs, restart the EdgeAdapter pod.
For PTX10001, PTX10004, and PTX10016 devices, the PSM Temperature graph on the Hardware Details for Device-Name page (Observability > Troubleshoot Devices > Device-Name > Hardware accordion > PSUs) does not display any data.

Workaround: None.
After the primary node is switched off, the OC-term and GNMI state on the Remote Management accordion (Observability > Troubleshoot Devices > Device-Name) are displayed as disconnected.

Workaround: You can do one of the following:
- Offboard and onboard the devices, or
- Restart the OC-term pod.
Sometimes, the graph on the Output Traffic Details for Device-Name page (Observability > Health > Troubleshoot Devices > Device-Name > Overview > Interfaces (accordion) > Output Traffic data link) displays a sudden spike in data. This issue occurs because some devices might erroneously send data more than once with the same time stamp.

Workaround: None.
Sometimes, data points on the graphs on the Input/Output Traffic Details for Device-Name pages (Observability > Health > Troubleshoot Devices > Device-Name > Overview > Interfaces (accordion) > Input/Output Traffic data link) are displayed as critical even though the traffic is below the threshold level.

This issue occurs when the LACP state of a child interface toggles. When the LACP state of a child interface goes down, the aggregated interface speed decreases according to the number of child interfaces that are active at the time of data collection. When the LACP state comes back up, the aggregated interface speed increases again. Alerts are raised when the input or output traffic utilization exceeds the threshold that corresponds to the reduced speed. However, because the graph plots only the latest thresholds, data points might be displayed as critical even though the traffic is below the plotted threshold.

Workaround: You can check for flaps on the aggregated or child interfaces on the Observability > Health > Troubleshoot Devices > Device-Name > Overview > Interfaces (accordion) > Interfaces > Link Flap performance graph. In addition, you can execute the run show lacp interfaces command to check the interface flap data and correlate these issues with the flapping LACP states of the child interfaces.
The numbers of unhealthy devices listed on the Troubleshoot Devices and Health Dashboard pages (Observability > Health) do not match.

Workaround: None.
You cannot delete unwanted nodes and links from the Paragon Automation GUI.

Workaround: Use the following REST APIs to delete nodes and links:
- REST API to delete a link:
  
  [DELETE] https://{{server_ip}}/topology/api/v1/orgs/{{org_id}}/{{topo_id}}/links/{{link_id}}
  
  Note:
  Follow the procedure at the end of this workaround to get the actual URL.
  
  For example:
  - URL: 'https://10.56.3.16/topology/api/v1/orgs/f9e9235b-37f1-43e7-9153-e88350ed1e15/10/links/15'
  - Curl:
- REST API to delete a node:
  
  [DELETE] https://{{Server_IP}}/topology/api/v1/orgs/{{Org_ID}}/{{Topo_ID}}/nodes/{{Node_ID}}
  
  Note:
  Follow the procedure below to get the actual URL.
  
  For example:
  - URL: 'https://10.56.3.16/topology/api/v1/orgs/f9e9235b-37f1-43e7-9153-e88350ed1e15/10/nodes/1'
  - Curl:
  Use the following procedure to get the actual URL to use with curl when deleting a link or a node:
  1. Navigate to the Topology page (Observability > Topology).
  2. Open the developer tools in the browser by pressing Ctrl+Shift+I.
  3. In the developer tools, select Network and then select the XHR filter option.
  4. Identify the link index number or the node number:
    1. On the Topology page of the Paragon Automation GUI, double-click the link or the node that you want to delete.
      
      The Link Link-Name page or the Node Node-Name page appears.
    2. Navigate to the Details tab and note the link index number or the node number that is displayed.
  5. In the developer tools, click the row that corresponds to the link index number or the node number of the link or the node that you want to delete.
  6. Copy the URL that you need to use in curl to delete the link or the node.
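
The URL formats shown above can be assembled and passed to curl as in the following sketch. The function names are hypothetical, and authentication is not described in this section, so any headers or options that your deployment requires are simply passed through to curl as extra arguments.

```shell
# Build the DELETE URL for a topology object.
# Arguments: server IP, organization ID, topology ID, kind (links or nodes), object ID.
topology_delete_url() {
  printf 'https://%s/topology/api/v1/orgs/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical wrapper: sends the DELETE request with curl.
# Any arguments after the first five are handed to curl unchanged; they are
# assumed to carry whatever authentication your deployment needs.
delete_topology_object() {
  url="$(topology_delete_url "$1" "$2" "$3" "$4" "$5")"
  shift 5
  curl --request DELETE "$@" "$url"
}
```

Calling topology_delete_url with 10.56.3.16, the organization ID f9e9235b-37f1-43e7-9153-e88350ed1e15, topology ID 10, links, and 15 reproduces the link example URL above. Prefer the URL copied from the developer tools when the two differ.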
The values for Input Utilization and Output Utilization are represented as a percentage of interface utilization instead of in Mbps.

This issue is seen on the Physical Interfaces Details for Service-Instance-Name tab (Orchestration > Instances > Service Instances > Service-Instance-Name hyperlink > Service-Instance-Name Details > Passive Assurance) and on the Input Traffic Details for Device-Name page (Observability > Health > Troubleshoot Devices > Device-Name > Overview > Interfaces (accordion) > Input Traffic data-link).

Workaround: To get the actual traffic rate, multiply the displayed value by the interface speed.
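
As a worked example of this workaround, the following sketch converts a displayed utilization value into Mbps. It assumes that the displayed value is a percentage between 0 and 100 and that the interface speed is given in Mbps. The function name utilization_to_mbps is hypothetical.

```shell
# Convert a displayed utilization percentage into a traffic rate in Mbps.
# Arguments: utilization in percent (assumed range 0-100), interface speed in Mbps.
utilization_to_mbps() {
  awk -v pct="$1" -v speed="$2" 'BEGIN { printf "%.2f\n", pct / 100 * speed }'
}
```

For instance, utilization_to_mbps 25 10000 prints 2500.00, that is, 25 percent of a 10-Gbps interface corresponds to 2500 Mbps.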

Not all optics modules support all the optics-related KPIs. See Table 2 for more information.

Workaround: None.

Table 2: KPIs Supported for Optics Modules
Module	Rx Loss of Signal KPI	Tx Loss of Signal KPI	Laser Disabled KPI
SFP optics	No	No	No
CFP optics	Yes	No	No
CFP_LH_ACO optics	Yes	No	No
QSFP optics	Yes	Yes	Yes
CXP optics	Yes	Yes	No
XFP optics	No	No	No

The Connectivity accordion (Observability > Troubleshooting Devices > Device-Name) shows no data although connectivity tests are properly run and passed.

Workaround: For the Connectivity accordion to display data, use the following REST API:

`/active-assurance/api/v2/orgs/${org_id}/streams?filter=measurement.metadata.tags.__sys__test_execution_id:'${t.id}'`

Service Orchestration

The "vpn_svc_type" service type is displayed as "pbb-evpn" instead of "evpn-mpls" on the Paragon Automation GUI and through the REST API.

Workaround: None.
The following limitations are seen when you use the service orchestration cMGD CLI to modify the placement-interface information of an L3VPN service:
- The initial placement-interface options that were populated when the service order was created are not displayed.
- You can select the interface for the site access from all the interfaces present on the CE or PE device.
- When you modify the PE topology and the available ports in the topology, you must:
  1. Delete the existing placement-interface and placement-options from the site network access by using either REST API or the service orchestration cMGD CLI.
  2. Execute the request service order modify command to regenerate the service order with the modified values for the placement-options.
Sometimes, the apply insights configuration (appy_insights-config) fails if you try to provision a service without properly deleting a previously provisioned service or a device.

For example, if you release the router without off-boarding or deleting a service, then the apply insights configuration fails when the same service or device is used in another organization.

Workaround:
- If there are stale services and devices, run the following REST APIs from the cMGD container of the foghorn namespace to delete stale services and devices, and rerun the workflow:
  - curl --request DELETE http://config-server.healthbot:9000/api/v2/config/services/device-group/<device-group name>/
  - curl --request DELETE http://config-server.healthbot:9000/api/v2/config/device-group/<device-group name>/
  - curl --request POST http://config-server.healthbot:9000/api/v2/config/configuration/
- If there are stale network-groups, run the following REST APIs from the cMGD container of the foghorn namespace to delete the stale network-groups, and rerun the workflow:
  - curl --request DELETE http://config-server.healthbot:9000/api/v2/config/services/device-group/<network-group name>/
  - curl --request DELETE http://config-server.healthbot:9000/api/v2/config/device-group/<network-group name>/
  - curl --request POST http://config-server.healthbot:9000/api/v2/config/configuration/
While configuring an EVPN service order, the GUI does not throw a validation error even if you specify a value that is equal to 1 Tbps for CBS and CIR fields.

Workaround: Based on your topology, ensure that you specify the right values for CBS and CIR fields.
The EVPN service order creation fails if you try to create an EVPN service order by importing an existing JavaScript Object Notation (JSON) file.

Workaround: If you are using a JSON file, ensure that you clear the placement section before you publish the service order.
For some devices such as ACX7024, if you configure VLANs on unused ports, the following error occurs:

VLAN must be specified on tagged interfaces.

Workaround: This issue is caused by the default factory configuration on the port. Delete the default factory configuration on the ports that you plan to use.
For an MX240 device, the OSPF-related data is not populated on the Passive Assurance tab (Orchestration > Instances > Service-Order-Name Details).

Workaround: Configure OSPF on the customer edge (CE) device.
Although multiple VLAN IDs are available in the topology resources, the Placement section of the EVPN service order lists only one VLAN ID in the drop down.

Workaround: To fix this issue:
1. Edit the EVPN service order to add new VLAN IDs. You can add the VLAN IDs under the Tagged Interface section.
2. Clear the Placement section by deselecting the device name.
3. Save and publish the service order.
While modifying a service order, you cannot clear the existing placements.

Workaround: If publishing the service order fails due to existing placements, you can export the failed service order to JSON format and then create a new service order or modify an existing service order by importing this JSON file. During the importing process, delete the placements and then publish the service order.
While creating or modifying an EVPN service order, you cannot configure multiple VLAN IDs on the Aggregated Ethernet (AE) interface. The EVPN considers the AE port as a single resource and therefore an AE interface cannot be reused across service instances even when the VLAN IDs on the AE IFL differ.

Workaround: None.
Scheduling provisioning of service orders is a Beta feature in Release 2.1.0. Except in fresh installations, scheduling may not work consistently.

Workaround: None.
The VLAN drop-down list in the Placement section displays only the selected VLAN instead of all the VLANs that are available according to the topology service order.

Workaround: None.
While modifying a resource instance, if you update VLAN with a value higher than the current specified value then the Modify Resource Instance operation fails.

Workaround: None.
The service order fails with the error message, Invalid XML document, namespace is missing.

Workaround: On the device with the failed configuration, you should turn off system services netconf rfc-compliant and system services netconf notification knobs.
During device onboarding, pings from the device to Microsoft Azure and Google Cloud Platform endpoints fail.

Workaround: Instead of Microsoft Azure and Google Cloud Platform, use Amazon Web Services (AWS) as the endpoint.
While creating an EVPN service order, if the MAC Address Limit that you have specified is out of the defined range then the service order fails.

Workaround: Specify a value that is within the defined range, and then republish the service order.
While creating or modifying an EVPN service order, the MAC Address Limit configuration is ignored if you specify Drop as the action to be taken when the upper limit for customer MAC addresses is exceeded.

Workaround: None.
Since you can manage resources from within network implementation plans, a blank topology instance (topo) is auto-created on the Resources Instances page (Orchestration > Service > Resource Instances). This topology instance is a read-only instance and is owned by the network operator.

Workaround: None.
If you have referenced a device in a service order and if you try to release the router from the organization, then you cannot delete the service.

Workaround: We recommend that you deprovision the service and offboard the device before you release the router from the Network Inventory page.
When you click the Refresh icon on the Service-Instance-Name Details page (Orchestration > Instances > Service-Instance-Name), you may not see the latest events in the Relevant Events section.

Workaround: To view the latest events, instead of using the refresh icon go to the Service Instance page (Orchestration > Instances) and select the service instance for which you need to see the latest events.
After you upgrade Paragon Automation from Release 2.1.0 to Release 2.2.0, L2VPN and L3VPN accordions (Orchestration > Instances > Service Instances > Service-Instance-Name hyperlink > Service-Instance-Name Details > Passive Assurance tab) show no data for service instances that were provisioned in Release 2.1.0. This issue does not occur if you install Paragon Automation afresh.

Workaround: You need to recreate the VPN service after you upgrade to Juniper Paragon Automation Release 2.2.0.
On the Customer Inventory page (Orchestration > Customers), when you click the View Instances hyperlink to view the service instances provisioned for the customer, a filter for the Customer field is automatically applied on the Service Instances page. Since filtering is not supported for the customer field, you may not see any data.

Workaround: Clear the filter by clicking the X icon in the advanced filtering section. You can then view all the service instances provisioned for the customer.
While creating an L3VPN service instance, if you click Reset Placements, the resources are cleared from the UI. However, these resources are not released from the back end.

Workaround: To clear any reserved resources, you can either provision or deprovision the service.
When you click Update Placements while modifying services, a confirmation message appears that the placements are successfully updated even though the REST API returns an error.

Workaround: None.
You must configure access diversity while creating an EVPN service order with all-active or single-active redundancy modes. However, the Access Diversity section is not populated in the UI.

Workaround: None.
The order state of a network implementation plan does not reflect the onboarding status of individual devices in the plan. The order status may show as a success even if the device onboarding has failed.

Workaround: We recommend that you view the details of a network implementation plan so that you can see the onboarding status of individual devices.

To view the details of a network implementation plan, you can select the network implementation plan on the Network Implementation Plan page (Inventory > Device Onboarding) and click More > Detail. Alternatively, hover over the plan name and click the Details icon that appears.
While modifying an existing L3VPN instance, if you try to remove a device that is already a part of the network implementation plan then the modify workflow fails.

Workaround: You must delete all paa-* groups and apply-groups from all the devices used in the instance.
After you upgrade Paragon Automation from Release 2.1.0 to Release 2.2.0, the vpn resource instance does not appear on the Resource Instances page (Orchestration > Service > Resource Instances). This issue occurs because the vpn service design has been renamed to vpn-resources in Release 2.2.0.

Workaround: We recommend that you delete the vpn resource instance, and create a new resource instance, vpn-resources. Ensure that the vpn-resources resource instance has the same data as that of the vpn resource instance.
While provisioning an EVPN service order, if you mark an access interface as untagged, the placement may still pick up a tagged interface from interface resources defined in the topology file.

For example, if et-0/0/2 is defined as untagged and et-0/0/3 is defined as a tagged interface in the topology resources, then the service order sets the access interface as untagged and the automatic placement may still pick up the tagged et-0/0/3 interface.

Workaround: Go to the placement section and manually choose the desired untagged interface from the drop down.
If you modify an L3VPN service to update the routing protocol then the changes are not reflected in the L3VPN accordion (Orchestration > Instances > Service Instances > Service-Instance-Name hyperlink > Passive Assurance).

Workaround: Instead of modifying the existing service, create a new service with the new routing protocol.
Scheduling provisioning of service orders does not work if you upgrade Paragon Automation to Release 2.2.0.

Workaround: None.
The Order History tab on the L3VPN-Name Details page (Orchestration > Instances > Service-Instance-Name hyperlink) lists all the order history even if the service instance is deprovisioned and provisioned again.

Workaround: None.
We recommend that you do not enable LDP for IPv6 for the L2 circuit.

Workaround: None.
Sometimes, the Interface Status column on the Passive Assurance tab of the L3VPN-Name Details page (Orchestration > Instances > Service-Instance-Name hyperlink) may not display data. Occasionally, you may not see any data on the Interface Status graph.

Workaround: None.

Active Assurance

On the Tests page (Observability > Active Assurance), the Test summary that is displayed on the info card is not based on the time range that you have selected. Instead, the Test summary is based on the number of Tests listed on a specific page.

Workaround: None.
Sometimes you may notice that the previously-available Test Agents and previously-provisioned Monitors are not shown in the GUI. In addition, the performance tests are timing out and the measurements are stopped.

Workaround: None.
You cannot delete a Test from the Test-Name page (Observability > Active Assurance > Test > Test-Name > Test-Name > More > Delete). The following error is displayed and the Test that you tried to delete is retained:

Failed to delete Test

Workaround: None.
The status of a Test Agent is shown as offline after the device's Routing Engine switches over from the primary Routing Engine to the backup Routing Engine, or vice versa.

Workaround: Reinstall Test Agent after the Routing Engine switchover.
The streams are not generated when you create a Test with a DNS plug-in and the following event is raised:
Could not get nameserver from resolv.conf
This issue occurs when the Test is associated with a Test Agent that runs on a Juniper Networks router with Junos OS EVO installed, and you don't specify the Name Server field while configuring a Test.

Workaround: Ensure that you specify a value for the Name Server field while configuring a Test.

Trust

There are no known issues in this release.

Administration

There are no known issues in this release.

Installation and Upgrade

The backup and restore functionality has the following caveats:
- You cannot restore data backed up from a setup with a different release of Paragon Automation to a setup with the current release.
- Paragon Automation backs up only application configurations such as devices, sites, service orders, and so on. Since a backup does not store the certificates and infrastructure services configurations, that information must be kept unchanged during restoration.
- Resources allocated to the network won’t be preserved after a restore and you must ensure that you release the allocated resources during the window between taking a backup and performing a restore.
Workaround: None.
If the PCE Server VIP address is not configured, kube-proxy is set to a random port.

Workaround: Configure the PCE Server VIP address.
If a node in the cluster is not operational, the status of the vector pod from the node that is not operational is displayed as Running, even though the node status is reported as Not Ready. This is due to an existing Kubernetes issue. See https://github.com/kubernetes/kubernetes/issues/117769.

Workaround: You can do one of the following:
- Monitor the metric, kube_daemonset_status_number_ready. When the value for this metric drops to three, you can manually check from which vector the data is missing.
- Set a query and an alert for the kube_daemonset_status_number_ready metric in Grafana.
You might encounter RKE2-related issues if you change the hostname after you set up a cluster.

We recommend that you do not change the hostname after a cluster is set up.

Workaround: None.
When the worker node is down, there might be issues if you create an organization or onboard a device.

Workaround: Do not create an organization or onboard a device when a worker node is down. Wait until the cluster recovers, and then create the organization or onboard the device. The cluster is in the recovered state when all the pods are in either the Running or the Pending state and none are in an intermediate state such as Terminating or CrashLoopBackOff.
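
The recovered-state condition above can be checked with a short filter such as the following. This is a minimal sketch: the function name pods_not_recovered is hypothetical, and it assumes the default column layout of kubectl get pods -A --no-headers (namespace, name, ready count, status, and so on).

```shell
# Reads the output of `kubectl get pods -A --no-headers` on stdin and prints
# every pod whose STATUS column is neither Running nor Pending.
# Empty output means that the cluster matches the recovered state described above.
pods_not_recovered() {
  awk '$4 != "Running" && $4 != "Pending" { print $1 "/" $2 " " $4 }'
}

# Usage (from the Linux root shell):
#   kubectl get pods -A --no-headers | pods_not_recovered
```

Note that this strict reading also lists pods in states such as Completed, so review the output rather than treating every listed pod as a failure.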
If you have powered off one of the primary nodes, you may not be able to log in to the Juniper Paragon Automation GUI.

Workaround: Restart papi-ws using the following Paragon Shell CLI command:

request paragon cluster pods reset service papi-ws namespace papi operation restart

Upgrading from Release 2.1.0 to Release 2.2.0 on VMs configured with minimum required specifications might fail due to a failed airflow-worker instance. Verify the reason for failure.

Workaround: Perform the following steps:

Lower the following CPU resource requests from the Linux root shell.
# kubectl patch deployment -n mems mems --type "json" -p '[{"op":"add","path":"/spec/template/spec/containers/0/resources/requests/cpu","value":"0.5m"}]'
# kubectl patch deployment -n tagify tagify --type "json" -p '[{"op":"add","path":"/spec/template/spec/containers/0/resources/requests/cpu","value":"0.5m"}]'
Rerun the upgrade.

On rare occasions, e-mails might not be sent out due to Kafka connection issues. The mail service logs display the following error:
"message":"dial tcp 10.x.x.43:9092: connect: connection refused"
This issue can occur anytime, but more commonly when the cluster has been upgraded or when a cluster node has been rebooted.

Workaround: Restart the mail server from the Linux root shell.
kubectl delete pods -n common mailservice-xxxxxxxxxx-xxxxx
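
Because the pod name ends in a generated suffix, the actual name has to be looked up before you run the command above. The following sketch filters kubectl get pods output for names that start with mailservice-. The function name mailservice_pods is hypothetical, and the common namespace is taken from the command above.

```shell
# Reads the output of `kubectl get pods -n common --no-headers` on stdin and
# prints the names of pods whose name starts with "mailservice-".
mailservice_pods() {
  awk '$1 ~ /^mailservice-/ { print $1 }'
}

# Usage (from the Linux root shell):
#   kubectl get pods -n common --no-headers | mailservice_pods
#   kubectl delete pods -n common <name printed by the previous command>
```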
