error
Syntax
error {
    error_id
    major {
        threshold threshold-value;
        action
    }
    minor {
        threshold threshold-value;
        action
    }
    fatal {
        threshold threshold-value;
        action
    }
    scope error-scope {
        category category {
            severity-level {
                threshold threshold-value;
                action
            }
        }
    }
    reset-pfe {
        pause-period <pause_minutes>;
        pfe-disable-period <pfe_disable_minutes>;
        retry-limit <retry_number>;
    }
}
Junos OS
[edit chassis]
[edit chassis fpc slot-number]
Junos OS Evolved
[edit chassis fpc slot-number]
[edit chassis sib slot slot-number]
Description
Configure the threshold at which the device performs the configured action in response to FPC or SIB errors. Starting in Junos OS Release 18.1R3, you can configure error thresholds and actions at the error scope and error category levels on MX Series routers.
Some Juniper devices include an internal framework for detecting and correcting FPC errors that can affect services. You can classify the errors according to severity, set an automatic recovery action for each severity, and set a threshold (that is, the number of times the error must occur before the action is triggered). By configuring error levels, you can isolate PFE errors, which reduces the need for field replacement of FPCs. For feature details and supported PTX platforms, see FPC self-healing.
On MX104 routers, Junos OS does not restart the system when it encounters a fatal error. Additionally, although you can configure the disable-pfe action for major errors on the MX104, the router does not disable its only PFE when it encounters a major error.
Options
You can configure the threshold for the following severity levels:
fatal: Fatal error on the FPC. An error that blocks a considerable amount of traffic across modules is a fatal error.
major: Major error on the FPC. An error that results in continuing loss of packet traffic but does not affect other modules is a major error.
minor: Minor error on the FPC. An error that results in the loss of a single packet but is fully recoverable is a minor error.
threshold threshold-value: Configure the threshold value at which to take action. If the severity level of the error is fatal, the action is carried out only once, when the total number of errors crosses the threshold value. If the severity level is major, the action is carried out once after the number of occurrences crosses the threshold. If the severity level is minor, the action is carried out as many times as the value specified by the threshold. For example, when the severity level is minor and you have configured the threshold value as 10, the action is carried out after the tenth occurrence. On Junos OS Evolved, for errors belonging to the internal category, the default threshold value is 1.
Note: You can set the threshold value to 0 for errors with minor severity, which means that no action is taken for that error. You cannot set the threshold value to 0 for errors with major or fatal severity.
Default: The error count for fatal and major actions is 1. The default error count for minor actions is 10.
Range: 0 through 429,496,729
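As an illustration of the note on zero thresholds, the fragment below sketches how a minor-severity threshold of 0 might be written so that no action is taken for minor errors. The FPC slot number 2 is a hypothetical value chosen for the example, and the layout simply follows the Syntax section above. It has not been verified on a device.

```
[edit chassis fpc 2]
error {
    minor {
        threshold 0;    # 0 is permitted only for minor severity; no action is taken
    }
}
```

A threshold of 0 under major or fatal would be rejected, according to the note above.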
The available detection and recovery actions are as follows:
alarm: Raise an alarm.
disable-pfe: Disable the PFE interfaces on the FPC.
get-state: Get the current state of the FPC.
log: Generate a log for the event.
offline: Take the FPC offline.
offline-pic: Take the PIC (installed in the FPC) offline.
reset: Reset the FPC.
reset-pfe: Reset the PFE. (Supported on PTX Series devices, including the PTX10003.)
offline-pfe: Take the PFE offline.
trap: Raise traps for the FPC errors.
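The severity levels, thresholds, and actions described above could be combined as in the following sketch. It assumes the chassis-wide hierarchy and assumes that the action is written as the keyword action followed by one of the action names listed above. The threshold values are illustrative, not recommendations, and the fragment should be checked against the CLI on the target release before use.

```
[edit chassis]
error {
    fatal {
        threshold 1;     # default for fatal errors
        action reset;    # reset the FPC
    }
    major {
        threshold 3;     # illustrative value; the default is 1
        action alarm;    # raise an alarm
    }
    minor {
        threshold 10;    # default for minor errors
        action log;      # generate a log for the event
    }
}
```

The same statements can be placed under [edit chassis fpc slot-number] to limit them to a single FPC.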
Starting in Junos OS Evolved Release 19.1R1, the offline and disable-pfe actions are not available for errors with minor severity (under the [edit chassis error minor action] hierarchy).
Starting in Junos OS Release 21.4R3, the additional reset-pfe options are valid only on MPC7, MPC8, and MPC9 line cards on the MX240, MX480, MX960, MX2008, MX2010, and MX2020 platforms.
The available detection and recovery actions are as follows for devices running Junos OS Evolved:
alarm: Raise an alarm.
fault: The system goes to the fault state but stays up (diagnostics can be run on it).
get-state: Get the current state of the FPC.
log: Generate a log for the event.
Starting in Junos OS Release 17.2R1, if you configure the disable-pfe, offline, offline-pic, or reset action on an MX Series or PTX Series router, the get-state action is also configured on the router. For example, if you configure the disable-pfe action, the router has both the disable-pfe and get-state actions configured.
scope error-scope: Group the errors of a particular severity into different scopes. Errors belonging to each error scope are further grouped into categories before thresholds and actions are defined at the group level. The following scopes are available: board and pfe. Junos OS Evolved also supports the scope switch.
category category: Categorize errors into subgroups under the scope level. An error category lets you group similar errors belonging to a particular scope and define actions for them at once, which eliminates the need to configure individual error IDs. Some of the error categories are functional, io (input/output errors), storage (for example, errors related to HDD, SSD, and flash), memory (for example, errors related to static RAM), processing (for example, CPU-related errors), and switch. Junos OS Evolved also supports the category internal. On every occurrence of an error belonging to the internal category, the software by default raises an alarm at the individual error level (not at the scope or category level). You cannot configure an action against errors belonging to the internal category.
severity-level: Configure the severity level associated with each error-id. The options are fatal, major, and minor.
error-id: Use the error ID to disable an error or modify the error severity associated with that error. An error-id, which is a unique error identifier, is represented as a Uniform Resource Identifier (URI). For example, /cpu/0/memory/0/memory-uncorrected-error is an error ID that indicates an uncorrectable error under CPU memory module instance 0.
reset-pfe: Configure thresholds associated with the reset-pfe action.
    pause-period <pause_minutes>: Pause period in minutes. Valid range is 0 to 10000000.
    pfe-disable-period <pfe_disable_minutes>: PFE disable period in minutes before reset PFE. Valid range is 1 to 10000000. The PFE disable period must be greater than the pause period.
    retry-limit <retry_number>: Retry limit for reset PFE. Valid range is 0 to 3.
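The scope, category, and reset-pfe options described above might be combined as in the following sketch. The scope (pfe), category (memory), and all numeric values are illustrative choices, and the action alarm form assumes the keyword action followed by an action name. The reset-pfe values are picked only to satisfy the stated constraints: pfe-disable-period is greater than pause-period, and retry-limit falls within 0 to 3.

```
[edit chassis]
error {
    scope pfe {
        category memory {
            major {
                threshold 5;      # illustrative value
                action alarm;     # raise an alarm for major memory errors in the pfe scope
            }
        }
    }
    reset-pfe {
        pause-period 5;           # minutes; valid range 0 to 10000000
        pfe-disable-period 10;    # minutes; must be greater than pause-period
        retry-limit 2;            # valid range 0 to 3
    }
}
```

Defining the threshold at the category level applies it to every error in that category, so individual error IDs do not need separate statements.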
Required Privilege Level
interface: To view this statement in the configuration.
interface-control: To add this statement to the configuration.
Release Information
Statement introduced in Junos OS Release 13.3.
Support for the reset-pfe error action added to the PTX10003 in Junos OS Evolved Release 22.4R1.