Manage Disaster Recovery
Configuration of Disaster Recovery (DR) between an active site and a standby site ensures geographical redundancy of network management services.
Before you initiate the DR process between both sites, perform the following tasks:
Ensure that the connectivity requirements as described in the Disaster Recovery Overview topic are met.
Check whether identical cluster configurations exist on both sites. We recommend that both clusters have the same number of nodes so that, even in the case of a disaster, the standby site can operate with the same capacity as the active site.
Ensure that the same versions of Junos Space Network Management Platform, high-level Junos Space applications, and device adapters are installed at both sites.
Shut down the DR process configured on Junos Space Network Management Platform Release 14.1R3 and earlier before upgrading to Junos Space Network Management Platform Release 15.2R1 and configuring the new DR process. For more information, see Stopping the Disaster Recovery Process on Junos Space Network Management Platform Release 14.1R3 and Earlier.
You cannot configure the new DR process unless you first stop the DR process that you set up on Release 14.1R3 or earlier. You do not need to perform this step on a clean installation of Junos Space Network Management Platform Release 15.2R1.
Ensure that the same SMTP server configuration exists on both sites to receive e-mail alerts related to the DR process. You can add SMTP servers from the SMTP Servers task group in the Administration workspace. For more information about adding SMTP servers, see Adding an SMTP Server.
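The version-parity prerequisite above can be checked offline before you start. The following is a minimal sketch, assuming you have already collected the installed component versions from each site by hand. The function name, the dictionary layout, and the sample component names and versions are all hypothetical and are not part of Junos Space.

```python
from typing import Dict, List


def find_version_mismatches(active: Dict[str, str], standby: Dict[str, str]) -> List[str]:
    """Return one message per component that differs or is missing on one site."""
    problems: List[str] = []
    for component in sorted(set(active) | set(standby)):
        active_version = active.get(component)
        standby_version = standby.get(component)
        if active_version is None:
            problems.append(f"{component}: missing on active site (standby has {standby_version})")
        elif standby_version is None:
            problems.append(f"{component}: missing on standby site (active has {active_version})")
        elif active_version != standby_version:
            problems.append(f"{component}: active {active_version} != standby {standby_version}")
    return problems


# Hypothetical inventories, recorded manually from each site.
active_site = {"Network Management Platform": "20.3R1", "Example Application": "20.3R1"}
standby_site = {"Network Management Platform": "20.3R1", "Example Application": "20.1R1"}

mismatches = find_version_mismatches(active_site, standby_site)
for line in mismatches:
    print(line)
```

An empty result means the two inventories match. Any reported line points to a component to align before you initiate DR.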
To configure Disaster Recovery:
Select Administration > Disaster Recovery > Manage Disaster Recovery.
The Configure Disaster Recovery Wizard page opens.
Enter the required parameters and select one or more devices from the list that you want to validate. See Table 1 for more details on the Configure Disaster Recovery Wizard page.
Field | Description |
---|---|
Site Role | Select the site role for which you want to configure DR. The available options are Active and Standby Site. Note: You must initiate DR on the Active Site first and then on the Standby Site; otherwise, the system prompts you to do so. |
Peer Site VIP Address | Enter a valid IP address for configuration. Note: You cannot edit this information if the DR is not in the Initialized state. |
Load Balancer’s CLI Admin Password | Enter a valid admin CLI password. Note: If you have more than one password, you can enter them separated by commas. You cannot edit this information if the DR is not in the Initialized state. |
Confirm Password | Re-enter the password that you entered in the previous field. |
Arbitrary Devices | Select one or more devices from the list of devices used during DR auto failover. You can also search and filter the devices. |
Next | Select Next to configure Disaster Recovery at the Active Site followed by the Standby Site. See Configuring Disaster Recovery at the Active Site and Configuring Disaster Recovery at the Standby Site. Next is enabled only after all the parameters are entered. |
The page for configuring Disaster Recovery at the Active Site, followed by the Standby Site, is displayed. For more details, see Configuring Disaster Recovery at the Active Site and Configuring Disaster Recovery at the Standby Site.
The following sections explain the procedure to configure DR at the Active and Standby Sites and to initiate disaster recovery between both sites.
Configuring Disaster Recovery at the Active Site
To configure Disaster Recovery at the Active Site:
Field | Description |
---|---|
Peer Site VIP | Displays the IP address entered on the Configure Disaster Recovery Wizard page. |
Arbitrary Devices | Displays all the devices that are selected on the Configure Disaster Recovery Wizard page. |
SCP Timeout | Displays the timeout value, in seconds, used to detect a failure in transferring files from standby to active site through Secure Copy Protocol (SCP). Note: You cannot edit the value if DR is not in the Initialized state. |
Maximum number of backup | Displays the number of backup files that you want to retain. Note: You cannot edit the value if DR is not in the Initialized state. |
Backup Schedule Note: You cannot edit the parameters if DR is not in the Initialized state. | |
Time of the day (in Hrs) | The time of the day when you want to schedule the backup. Time is in 24-hour format. |
Days of the week | The days when you want to schedule the backup. |
Restore Schedule Note: You cannot edit the parameters if DR is not in the Initialized state. | |
Time of the day (in Hrs) | The time of the day to copy files from active site to standby site. Time is in 24-hour format. |
Days of the week | The days to copy files from active site to standby site. |
Watchdog Note: You cannot edit the parameters if DR is not in the Initialized state. | |
Heartbeat retry times | The number of times the active site should send heartbeat messages to the standby site. The value ranges from 4 to 15. |
Heartbeat message timeout | The timeout value of each heartbeat message, in seconds. The maximum and default value is 5. |
Heartbeat message interval | Displays the time interval, in seconds, between two consecutive heartbeat messages to the standby site. The value ranges from 30 to 120 seconds. |
Notification email | The e-mail address of the administrator to whom e-mail messages about disaster recovery service issues must be sent. |
Notification interval | The dampening interval, in seconds, during which the same issues are not reported again through e-mail. The value ranges from 300 to 1800 seconds. |
Failure Detection | |
Failure detection method | Displays the method of failure detection. Note: In Junos Space Network Management Platform 20.3R1, only the default option is allowed through the GUI. |
Failure detection threshold percentage | Displays the threshold percentage for failure detection. |
When you have entered values for all parameters, disaster recovery is initialized at the active site.
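The Watchdog ranges listed for the Active Site can be checked before you enter them in the wizard. The following is a minimal sketch of such a pre-check. The class and field names are hypothetical, and the lower bound of 1 second for the heartbeat message timeout is an assumption, because the table states only the maximum and default value of 5.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WatchdogSettings:
    heartbeat_retries: int        # documented range: 4 to 15
    heartbeat_timeout_s: int      # documented maximum and default: 5
    heartbeat_interval_s: int     # documented range: 30 to 120 seconds
    notification_interval_s: int  # documented range: 300 to 1800 seconds


def validate_watchdog(settings: WatchdogSettings) -> List[str]:
    """Return one message per value that falls outside its documented range."""
    errors: List[str] = []
    if not 4 <= settings.heartbeat_retries <= 15:
        errors.append("Heartbeat retry times must be between 4 and 15")
    # Assumption: the minimum timeout is 1 second; only the maximum is documented.
    if not 1 <= settings.heartbeat_timeout_s <= 5:
        errors.append("Heartbeat message timeout must be between 1 and 5 seconds")
    if not 30 <= settings.heartbeat_interval_s <= 120:
        errors.append("Heartbeat message interval must be between 30 and 120 seconds")
    if not 300 <= settings.notification_interval_s <= 1800:
        errors.append("Notification interval must be between 300 and 1800 seconds")
    return errors


ok = validate_watchdog(WatchdogSettings(4, 5, 30, 300))
bad = validate_watchdog(WatchdogSettings(3, 5, 200, 300))
print(ok)
print(bad)
```

In this example, the first set of values passes, and the second set is reported for its retry count and heartbeat interval.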
Configuring Disaster Recovery at the Standby Site
To configure Disaster Recovery at the Standby Site:
You must initialize the Active Site before you initialize the Standby Site. Arbitrary devices can be selected only at the Active Site.
Field | Description |
---|---|
Peer Site VIP | Displays the IP address entered on the Configure Disaster Recovery Wizard page. |
Arbitrary Devices | Displays all the devices that are selected on the Configure Disaster Recovery Wizard page. |
SCP Timeout | Displays the timeout value, in seconds, used to detect a failure in transferring files from standby to active site through Secure Copy Protocol (SCP). Note: You cannot edit the value if DR is not in the Initialized state. |
Maximum number of backup | Displays the maximum number of backups to retain at the standby site. Note: You cannot edit the value if DR is not in the Initialized state. |
Backup Schedule Note: You cannot edit the parameters if DR is not in the Initialized state. | |
Time of the day (in Hrs) | The time of the day when you want to schedule the backup. Time is in 24-hour format. |
Days of the week | The days when you want to schedule the backup. |
Restore Schedule Note: You cannot edit the parameters if DR is not in the Initialized state. | |
Time of the day (in Hrs) | The time of the day to copy files from active site to standby site. Time is in 24-hour format. |
Days of the week | The days to copy files from active site to standby site. |
When you have entered values for all parameters, disaster recovery is initialized at the standby site.
Actions Common to Both Active and Standby Sites
Table 4 shows the actions that are common to configuring both the Active and Standby Sites.
Field | Action |
---|---|
Initialize | Starts the initialization of DR with the given values. This action is enabled only when all the parameters are provided with correct values on both sites. |
Reset | Resets the DR configuration. This action is enabled only when DR is already initialized or stopped. |
Start | Starts the DR process. This action is enabled when DR is already initialized. |
Stop | Allows you to stop the configuration on either site or on both sites. |
Manual Failover | Performs a manual failover on the standby site. This action is available only when DR has started or is stopped. |
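The enablement rules in Table 4 can be read as a small state-to-actions mapping. The following is a minimal sketch of that reading. The state names, the function, and the `parameters_valid` flag are hypothetical. The table gives no state condition for Initialize and no condition at all for Stop, so the sketch gates Initialize on parameter validity alone and assumes that Stop applies only while DR is started.

```python
from enum import Enum
from typing import Set


class DRState(Enum):
    NOT_INITIALIZED = "not initialized"  # assumed name for the state before Initialize
    INITIALIZED = "initialized"
    STARTED = "started"
    STOPPED = "stopped"


def enabled_actions(state: DRState, parameters_valid: bool) -> Set[str]:
    """Return the actions that Table 4 describes as available in the given state."""
    actions: Set[str] = set()
    if parameters_valid:
        # Table 4: enabled only when all parameters have correct values.
        actions.add("Initialize")
    if state in (DRState.INITIALIZED, DRState.STOPPED):
        actions.add("Reset")
    if state is DRState.INITIALIZED:
        actions.add("Start")
    if state is DRState.STARTED:
        # Assumption: Table 4 states no condition for Stop.
        actions.add("Stop")
    if state in (DRState.STARTED, DRState.STOPPED):
        actions.add("Manual Failover")
    return actions


for state in DRState:
    print(state.value, sorted(enabled_actions(state, parameters_valid=False)))
```

Treat the mapping as an aid for reading the table, not as a description of how the GUI is implemented.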
Disaster Recovery Health
To check the Disaster Recovery health status: