Setting up Disaster Recovery

Disaster Recovery (DR) is a procedure that enables the recovery or continuation of a system following a disaster.

The Harmony disaster recovery utility is used when users require a DR setup, so that when the primary data center fails, service can be recovered on the disaster recovery setup running in another data center.

Initial Primary and Disaster Recovery Site Configuration

To prepare for disaster recovery, the following initial configuration is performed:

  • Set up two identical (similar hardware and software version) Harmony Controller environments, one as primary and the other as secondary.

  • Configure SSL certificates with the same FQDN on both the primary and secondary Harmony Controller environments. Use the Certificate section under Configuration Management in the A10 Harmony Controller Operator Console.

Note

The rsync port (by default, port 22) must be open between the primary and DR Harmony Controllers.
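
A quick reachability check before running the setup can catch a blocked port early. The following is a minimal sketch and is not part of the Harmony Controller utilities: port_open is a hypothetical helper name, it assumes netcat (nc) is installed, and the address in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if a TCP connection to host:port succeeds.
# Assumes netcat (nc) is installed; it is not a Harmony Controller utility.
port_open() {
    host=$1
    port=${2:-22}   # port 22 is the default named in the note above
    # Reject anything that is not a plain port number.
    case $port in
        ''|*[!0-9]*) echo "invalid port: $port" >&2; return 2 ;;
    esac
    # -z: only test the connection, -w 5: give up after 5 seconds.
    nc -z -w 5 "$host" "$port"
}

# Example (placeholder address), run from the primary controller:
# port_open 192.0.2.10 22 && echo "port 22 reachable"
```

Running the check in both directions (primary to DR and DR to primary) covers the "across" requirement in the note.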

The diagram below shows the setup prior to DR:

_images/dr01.png

The diagram below shows the setup after DR:

_images/dr02.png

Set-up Procedure

  1. Assuming the primary Harmony Controller is already set up, deploy the DR Harmony Controller with similar hardware and software version. After deploying it with IP addresses, import the SSL certificate and configure the FQDN using the Operator Console.

  2. Run the metrics_backup_setup script on both servers, only if the analytics data needs to be backed up and restored:

  • For root users:

    cd /home/admin/upgrade/a10-harmony-controller-5.1.0p2/utilities
    ./metrics_backup_setup.sh

  • For non-root users:

    cd /home/admin/upgrade/a10-harmony-controller-5.1.0p2/utilities
    sudo ./metrics_backup_setup.sh

Note

Run the metrics_backup_setup.sh utility on both the primary and secondary Harmony Controller setups.

  3. Update the DNS to route requests to the primary.

  4. Register the Thunder device with the FQDN.

  5. Configure DR on the primary with the DR setup utility.

Set-up Disaster Recovery

  1. Log in to node0 of the primary Harmony Controller environment, either as root or as a non-root user with root privileges.

  2. Go to the Harmony Controller installation folder.

    For example:

    cd /home/admin/offline/a10-harmony-controller-5.1.0/utilities
    

    Note

    Keep the following information ready:

    • IP address of DR Harmony Controller Node0

    • User Name to access DR Harmony Controller Node0

    • (Optional) Password for DR Harmony Controller Node0. If key-based authentication is not available, keep ready the password of the root user, or of a user with root privileges, on node0 of the DR Harmony Controller environment.

  3. Run the DR utility and follow the on-screen instructions:

    • For root user:

      ./dr_setup.sh
      
    • For non-root user:

      sudo ./dr_setup.sh
      

    Note

    For a key-based login, SSH keys can be set up automatically while running the setup. Alternatively, you can set up existing SSH keys manually.

  4. Choose one of the following deployment methods:

    • Another Harmony Controller deployment in stopped state

    • Another Linux machine

    Sample output:

    [root@harmony-release-5-3-0-p1 utilities]# ./dr_setup.sh
    Logs will be witten to /a10harmony/logs/harmony-backup/drsetup-2021-08-09-12-51-52.log
    
    ********************************************************************************
             Welcome to Harmony Controller Disaster Recovery Setup
    ********************************************************************************
    
    For Disaster Recovery, Harmony Controller backs up the data at a passive location. This passive location can be:
        1.  Another Harmony Controller deployment in stopped state
        2.  Another linux machine
    
    More details can be found in Harmony Controller documentation.
    
    
    Provide the method you want to use for DR setup (1/2)[1]:2
    Enter IP address of remote machine: 129.146.72.41
    Enter path where backups should be copied [/a10harmony/]:
    Enter username using which this Harmony Controller can copy backups to remote machine[root]: opc
    [Info] user opc needs to have a passwordless sudo access on 129.146.72.41 .
    
     SSH Key setup is required between the two controllers.
     If the user [opc] can login to remote machine [129.146.72.41] using password, automatic SSH key setup is possible.  You need to type the SSH password when prompted. The password will not be saved.
    
     Else, SSH key setup is to be done manually
    
     Do you want to go for automatic SSH key setup (y/n)[y]:: n
    
     Manual SSH Key Setup:
     Step 1: Copy the public key of this controller (excluding the leading/trailing spaces and empty lines) from file ~/.ssh/id_rsa.pub. If id_rsa.pub file is missing or empty, create one using 'ssh-keygen' utility and copy the contents.
     Step 2: Login (SSH) to remote machine [129.146.72.41] as user [opc]
     Step 3: Open file ~/.ssh/authorized_keys using vi editor:
      vi ~/.ssh/authorized_keys
    
     Step 4: Paste the public key copied in step 1 into the end of file
     Step 5: Save the file and exit
    
     Press key (y) after above steps are done ……:y
     configmap/dr-config configured
     configmap/hc-backup configured
     Setting up Harmony Controller data backup to the remote Linux machine ........ [Done]
         ******** Congratulation! Disaster Recovery Setup Completed Successfully ********
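
The manual key steps shown in the sample output can also be scripted. The sketch below is an illustration under stated assumptions and is not shipped with Harmony Controller: add_public_key is a hypothetical helper meant to run on the remote machine after the primary controller's id_rsa.pub has been copied there, and it assumes the file holds a single key line. It appends the key only if it is not already present, so running it twice is harmless.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already there.
# Usage: add_public_key <public-key-file> <authorized_keys-file>
add_public_key() {
    pub_file=$1
    auth_file=$2
    # Refuse to continue if the key file is missing or empty.
    [ -s "$pub_file" ] || { echo "missing or empty: $pub_file" >&2; return 1; }
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    key=$(cat "$pub_file")
    # -F fixed string, -x whole line, -q quiet: append only when absent.
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# Example (placeholder path for the copied key):
# add_public_key /tmp/primary_id_rsa.pub "$HOME/.ssh/authorized_keys"
```

Where password login is available, the standard OpenSSH ssh-copy-id tool performs the same append from the primary side.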
    

Once configured, DR copies periodic backups from node0 of the primary Harmony Controller to node0 of the DR Harmony Controller or to a remote Linux machine. The periodic backup runs every hour and retains 10-hour, 7-day, 5-week, and 3-month backups as part of DR. You can use any of these four backups as required.

Note

In a three-node controller deployment, Harmony Controller does not go down when node0 is down. However, the disaster recovery utility stops working until node0 is operational again.

Best Practices of Disaster Recovery

The following best practices are recommended for disaster recovery:

  • Periodically verify the logs to ensure that operations are working without issues.

  • Perform drills to ensure that the DR process works fully, end to end.

  • Create and maintain a Standard Operating Procedure (SOP).

Data Restore at Disaster Recovery Site

Run the restore utility to update the DR system with the latest available data:

  1. Log in to node0 of the secondary Harmony Controller environment, either as root or as a non-root user with root privileges.

  2. Go to the Harmony Controller installation folder.

    For example:

    cd /home/admin/offline/a10-harmony-controller-5.1.0/utilities
    
  3. If the backup is on the remote Linux machine, copy that backup into this installation folder.

  4. Run the Harmony restore utility:

    1. If metrics backup is enabled:

      For root user:

      ./harmony_restore.sh --metrics=yes
      

      For non-root user:

      sudo ./harmony_restore.sh --metrics=yes
      
    2. If metrics backup is not enabled:

      For root user:

      ./harmony_restore.sh --metrics=no
      

      For non-root user:

      sudo ./harmony_restore.sh --metrics=no
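
The four command variants above differ only in the sudo prefix and the --metrics value. As a minimal sketch, the hypothetical restore_command helper below (not part of the Harmony Controller utilities) prints the matching command for a given metrics setting and user ID, so it can be reviewed before it is run. Only the harmony_restore.sh name and the --metrics flag come from this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the restore command for a metrics setting and uid.
# Usage: restore_command <yes|no> [uid]   (uid defaults to the current user)
restore_command() {
    metrics=$1
    uid=${2:-$(id -u)}
    case $metrics in
        yes|no) ;;
        *) echo "usage: restore_command yes|no [uid]" >&2; return 2 ;;
    esac
    prefix=""
    [ "$uid" -eq 0 ] || prefix="sudo "   # non-root users need sudo
    printf '%s./harmony_restore.sh --metrics=%s\n' "$prefix" "$metrics"
}

restore_command yes 0      # prints: ./harmony_restore.sh --metrics=yes
restore_command no 1000    # prints: sudo ./harmony_restore.sh --metrics=no
```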
      

Procedure at the Time of Disaster

Pre-Requisites

If any of the nodes are operational, bring down all the nodes.

Procedure

When the primary Harmony Controller is not available, perform the following steps to bring the DR Harmony Controller into service:

  • Follow the steps in the Best Practices of Disaster Recovery.

  • Change the DNS record for the FQDN to point to the IP address of the secondary Harmony Controller. The change takes a few minutes to take effect.

  • Verify that the devices are registered automatically in the DR controller and that all operations are working satisfactorily.

  • If the dataset is large, the historical metrics data may take time to show up in the dashboard.

  • Log in to the Harmony Controller using the FQDN and verify the configuration and metrics.

  • The secondary Harmony Controller is now the active Harmony Controller system.
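
Because the DNS change takes a few minutes, it can help to confirm that the FQDN resolves to the secondary controller before logging in. The following sketch is an assumption-based illustration: address_listed and fqdn_points_to are hypothetical helper names, getent is assumed to be available (as on most glibc-based Linux systems), and the FQDN and IP address in the usage comment are placeholders.

```shell
#!/bin/sh
# Succeeds if the expected IP appears as a whole line in a
# newline-separated list of addresses.
address_listed() {
    expected=$1
    addresses=$2
    printf '%s\n' "$addresses" | grep -qxF -- "$expected"
}

# Resolve the FQDN through the system resolver and check it against
# the expected IP address of the secondary controller.
fqdn_points_to() {
    fqdn=$1
    expected=$2
    resolved=$(getent ahosts "$fqdn" | awk '{print $1}' | sort -u)
    address_listed "$expected" "$resolved"
}

# Example with placeholder values:
# fqdn_points_to harmony.example.com 192.0.2.20 && echo "DNS points to DR"
```

A local resolver may keep serving the old address until the record's TTL expires, so a failed check shortly after the change is not necessarily an error.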

Disaster Recovery Metrics

The utility performs rotational retention based on the configuration. By default, the last six backups are retained. The directory structure is synchronized with the target system, either the disaster recovery Harmony Controller or an optional storage server, at the interval configured in the crontab (by default, one hour).
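
To illustrate what rotational retention means, the sketch below keeps the newest six entries in a directory and deletes the rest. It is not the utility's actual implementation: prune_backups is a hypothetical helper, and it assumes one backup per directory entry and names without newlines.

```shell
#!/bin/sh
# Hypothetical helper: keep the newest <keep> entries in <dir>, delete the rest.
# Usage: prune_backups <dir> [keep]   (keep defaults to 6)
prune_backups() {
    dir=$1
    keep=${2:-6}
    # ls -1t lists newest first; tail skips the first <keep> names,
    # leaving only the entries that fall outside the retention window.
    ls -1t "$dir" | tail -n +"$((keep + 1))" | while IFS= read -r name; do
        rm -rf -- "$dir/$name"
    done
}

# Example (placeholder path):
# prune_backups /path/to/backups 6
```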

The metrics can be of high volume depending on the deployment, so it may take some time for the metrics database to populate and show in the dashboard. The typical metrics recovery rate is 3 GB per minute.
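
The recovery rate makes a rough estimate possible. As a minimal sketch, the hypothetical estimate_restore_minutes helper below divides the dataset size by the rate and rounds up to whole minutes. The 3 GB per minute default is the typical figure given here, and actual rates depend on the deployment.

```shell
#!/bin/sh
# Hypothetical helper: estimate metrics restore time in whole minutes.
# Usage: estimate_restore_minutes <size_gb> [rate_gb_per_min]
estimate_restore_minutes() {
    size_gb=$1
    rate=${2:-3}   # typical rate: 3 GB per minute
    # Integer ceiling division: round any partial minute up.
    echo $(( (size_gb + rate - 1) / rate ))
}

estimate_restore_minutes 90    # prints 30
estimate_restore_minutes 100   # prints 34
```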

Recovery Point Objective (RPO) is the maximum period of time for which data may be lost, and Recovery Time Objective (RTO) is the time taken to restore the service.

The Disaster Recovery Metrics for Harmony Controller:

  • With hourly backups, the RPO is at most 1 hour.

  • The service recovery time (RTO) includes the time for:

    • Powering up the DR system to service

    • Response to the issue

    • Restoration of backup

    • Moving the FQDN.

      For example, for an RTO:

      • 10 minutes to startup

      • 10 minutes to respond

      • 5 minutes to recover from backup

      • 5 minutes for FQDN switch over

      Total time is 30 minutes.
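
The same sum can be recomputed for your own phase estimates. The sketch below uses a hypothetical rto_total helper that adds up the minutes passed to it. The values in the example call are the ones from the example above.

```shell
#!/bin/sh
# Hypothetical helper: sum the minutes of each recovery phase.
# Usage: rto_total <minutes>...
rto_total() {
    total=0
    for minutes in "$@"; do
        total=$((total + minutes))
    done
    echo "$total"
}

# startup, response, recovery from backup, FQDN switchover
rto_total 10 10 5 5   # prints 30
```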