thermal-health-check
Syntax (Junos OS)
thermal-health-check {
    fet-failure-check {
        action-onfail (auto-shutdown | none);
    }
    action-onfail (auto-shutdown | none);
    shutdown-timer value;
    power-threshold value;
    power-threshold-percent value;
}
Syntax (Junos OS Evolved)
thermal-health-check {
    action-onfail (auto-shutdown | none);
    power-threshold-percent value;
}
Hierarchy Level
[edit chassis]
Description
Enable the thermal health check and configure the action taken when a thermal health event, such as power leakage, is detected on PTX5K, MX10K, PTX10K, and QFX10K devices. The thermal health check feature monitors the PSM power output and FRU power consumption every minute. When the PSM power output exceeds the FRU power consumption by the default or configured threshold for three consecutive iterations, the system assumes a thermal health event and takes the configured action.
The default threshold for QFX10002 devices is 100 W and for other devices is 600 W.
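The detection rule described above can be written out as a short worked condition. This is only a sketch of the stated behavior, not the platform's actual implementation. The symbols are introduced here for illustration, and treating "exceeds" as a strict inequality is an assumption.

```latex
% Hypothetical notation: \Delta P_t is the gap measured at the t-th one-minute check,
% T is the default or configured threshold in watts.
\[
\Delta P_t = P_{\text{PSM out},\,t} - P_{\text{FRU},\,t},
\qquad
\text{event at } t \iff
\Delta P_{t-2} > T \;\wedge\; \Delta P_{t-1} > T \;\wedge\; \Delta P_t > T
\]
% Worked example with T = 600 W:
%   gaps of 650 W, 700 W, 620 W at three consecutive checks -> thermal health event
%   gaps of 650 W, 580 W, 700 W -> no event, because the run of three is broken
```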
The default action is set to none. You can use the following command to shut down all PSMs when a thermal health check fails:
set chassis thermal-health-check action-onfail auto-shutdown shutdown-timer value (power-threshold | power-threshold-percent) value
On a Junos OS device, you can configure either power-threshold or power-threshold-percent, but not both, because the two options cannot coexist.
You can configure power-threshold-percent for systems that are connected to either second- or third-generation power supplies.
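As an illustration of the mutually exclusive threshold options on Junos OS, the following configuration-mode sketch first sets a watt-based threshold and then switches to a percentage-based one. The values 700 and 8 are illustrative assumptions, not recommendations. The statement names come from the syntax above, and delete and commit are the general Junos configuration-mode commands.

```
# Watt-based threshold (700 is an illustrative value only)
set chassis thermal-health-check action-onfail auto-shutdown
set chassis thermal-health-check power-threshold 700
commit

# To switch to a percentage-based threshold, remove the watt-based one first,
# because the two statements cannot coexist on Junos OS
delete chassis thermal-health-check power-threshold
set chassis thermal-health-check power-threshold-percent 8
commit
```

Removing power-threshold before setting power-threshold-percent keeps the candidate configuration valid at every step.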
To ensure accurate operation, the thermal health check feature enables the system to shut down the PSMs with a load under 20%. You can expect a margin of error in total FRU input and PSM output power readings compared to actual hardware values.
You must enable the PSM watchdog feature along with the thermal health check so that the system shuts down if a thermal health event causes Junos to go down. A PEM firmware upgrade is required for the thermal health check and PSM watchdog features.
You can enable the fet-failure-check option to monitor a power supply that is failing because of a field-effect transistor (FET) failure and to take corrective action. You can choose to shut down a reporting PSM if a redundant power supply is available, raise an alarm, and log the events when a risk of a thermal event is determined.
The fet-failure-check option is supported on MX10K and PTX10K devices that run on Junos OS.
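A minimal sketch of enabling FET failure detection with automatic shutdown, assuming a supported MX10K or PTX10K device running Junos OS. The statement path follows the Junos OS syntax shown above. Whether auto-shutdown is appropriate depends on the power redundancy available at your site.

```
# Enable FET failure detection and shut down the reporting PSM on failure
set chassis thermal-health-check fet-failure-check action-onfail auto-shutdown
commit
```

Setting the action to none instead leaves the PSM running.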
Options
fet-failure-check | Enable FET failure detection, and configure an action to be taken upon FET failure. |
action-onfail | Choose an action to be performed on detection of a thermal health event. The following options are available: auto-shutdown (shut down all PSMs when the thermal health check fails) and none (the default). |
shutdown-timer value | Set the timer, in seconds, after which the PSMs shut down (range: 10 seconds to 15 minutes). This setting is necessary in case the automatic shutdown does not occur during a thermal health event because of a software freeze. The default shutdown timer is 900 seconds, and you can reconfigure it. If the configured timer value is shorter than the default, you must deactivate it before rebooting the system, because the reboot delay could cause the timer to expire. The shutdown-timer option is not available on Junos OS Evolved; instead, the timeout value configured for the PSM watchdog feature serves as the shutdown timer. |
power-threshold value | Set the power threshold value in watts. The default value for power-threshold is 600 (100 on QFX10002 devices). Avoid using the default value for systems connected to second- or third-generation power supplies. |
power-threshold-percent value | Set the power threshold as a percentage of Total System Power Output (range: 4 to 10 percent). This value indicates the percentage difference between Total System Power Output and Total System FRU Input Power. On Junos OS, you must set a value when you configure power-threshold-percent because the system does not provide a default. On Junos OS Evolved, power-threshold-percent has a default value of 8. |
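The shutdown-timer guidance in the table above can be sketched as the following Junos OS configuration-mode sequence. The value 600 is an illustrative assumption within the documented range. The sketch also assumes that "deactivate it" refers to the shutdown-timer statement itself and that the general deactivate and activate configuration-mode commands can be applied to that statement.

```
# Illustrative value only: 600 seconds, shorter than the 900-second default
set chassis thermal-health-check action-onfail auto-shutdown
set chassis thermal-health-check shutdown-timer 600
commit

# Before a planned reboot: deactivate the shorter-than-default timer
deactivate chassis thermal-health-check shutdown-timer
commit

# After the system is back up: restore the timer
activate chassis thermal-health-check shutdown-timer
commit
```

Deactivating rather than deleting the statement keeps the configured value in place, so it can be restored after the reboot without retyping it.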
| Alarm | Description | Remedy | Severity |
|---|---|---|---|
| | Appears when thermal health check fails upon exceeding the threshold value. If you have configured auto-shutdown, then the system will shut down. | If the action is set to If the action is set to | Major |
| | Appears when the load of any active PSM is below 20%. System will shut down these PSMs, provided N+2 redundancy criteria is met. If the redundancy criteria is not met, the thermal health check feature will raise this alarm because it cannot shut down these PSMs. On Junos OS Evolved, the PSM state becomes | Shut down the less loaded PSMs to increase the load of active PSMs. | Minor |
| | Appears at PSM-level when the load of any active PSM is below 20%. System shuts down these PSMs, provided N+2 redundancy criteria is met. | No remedy required for this alarm. Maintaining a load above 20% for all PSMs is recommended to ensure accurate thermal health check feature functionality and shutdown of unnecessary PSMs. | Minor |
Required Privilege Level
interface—To view this statement in the configuration.
interface-control—To add this statement to the configuration.
Release Information
Statement introduced in Junos OS Release 20.1R1.
fet-failure-check option introduced in Junos OS Release 21.2R1.