Disaster recovery best practices

Backing up your VSM

  • CloudByte recommends that you move the full zipped backup of the VSM to a safe location. Either schedule a cron job or manually copy the files from the DR VSM or the source VSM.
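
The scheduled copy could be sketched as a small shell script like the one below. It is only an illustration: the function name copy_vsm_backup, the script path, and the backup and destination paths are hypothetical, and you need to substitute the actual location of the zipped backup on your system.

```shell
#!/bin/sh
# Sketch only: copy_vsm_backup and every path used here are hypothetical.
#
# copy_vsm_backup SRC DEST_DIR
# Copies the backup archive SRC into DEST_DIR, verifies the copy
# byte-for-byte, and prints the path of the copy on success.
copy_vsm_backup() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "backup not found: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/$(basename "$src")"
    cp -p "$src" "$dest" || return 1
    cmp -s "$src" "$dest" || { echo "copy mismatch: $dest" >&2; return 1; }
    echo "$dest"
}

# Run only when two arguments are given, so the file can also be sourced.
if [ "$#" -eq 2 ]; then
    copy_vsm_backup "$1" "$2"
fi

# Example crontab entry (daily at 02:00; paths are placeholders):
#   0 2 * * * /usr/local/bin/copy_vsm_backup.sh /path/to/vsm_backup.zip /mnt/safe/vsm
```

Verifying the copy with cmp keeps a truncated or interrupted transfer from being mistaken for a good backup.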

Backup Schedule

  • Set the backup schedule interval of the DR VSM to at least 5 minutes.

Pool Size

  • Ensure that the source Pool of the VSM has enough space to retain data locked by the snapshot(s). The space required usually depends on the amount of data modified or deleted since the snapshot was captured.
  • Ensure that the destination Pool has enough space to host the DR VSM.

Network Settings

  • Plan the DR VSM network configuration (interface and IP address) prior to the first DR VSM activation.

Snapshots

  • For block-level storage (iSCSI), and especially with databases, CloudByte recommends that you capture application-consistent snapshots of the volumes at regular intervals, with an appropriate number of retention copies.

DR failover and failback checklist

Source VSM must be offline

Activating the DR VSM fails if the source VSM is still online.

ElastiCenter must be online

To activate the DR VSM, ElastiCenter should be online.

Enabling data transfer

After a failover, the source VSM and the DR VSM reverse roles. You must manually enable data transfer in ElastiCenter.

Troubleshooting DR activation process

Problem

Disaster recovery activation process is unsuccessful

Solution

Ensure the following:

  • The backup IP address of the destination Pool is reachable from the source VSM.
  • The VSM IP address is reachable from the destination Node.
  • On the destination Node, a bkpd process is up and running with the Pool backup IP address. You can check this by running the following command:
    sockstat | grep bkpd
  • Check the error log on the destination Node (/var/log/bkpd.log).
  • On the source Node, check for errors in the logs available in the VSM. You can do this by running the following command, where jail_id is the ID of the VSM's jail:
    jexec jail_id vi /var/log/cbdpd.log
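
The checks above can be sketched as a few shell helpers. This is a minimal illustration under stated assumptions, not a supported tool: the helper names (is_reachable, bkpd_bound_to, count_log_errors) are hypothetical, the parsing assumes the default sockstat column layout (command in column 2, local address in column 6), and the log format is not documented here, so the error count simply matches lines containing the word "error".

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; adjust to your environment.

# is_reachable HOST
# Succeeds if HOST answers a single ping.
is_reachable() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# bkpd_bound_to IP
# Reads `sockstat` output on stdin and succeeds if a bkpd socket has a
# local address starting with "IP:".
# Assumes the default layout: USER COMMAND PID FD PROTO LOCAL FOREIGN.
bkpd_bound_to() {
    awk -v ip="$1" '
        $2 == "bkpd" && index($6, ip ":") == 1 { found = 1 }
        END { exit !found }
    '
}

# count_log_errors FILE
# Prints the number of lines in FILE that contain "error" (any case).
# Assumption: error entries include the word "error".
count_log_errors() {
    # grep -c exits non-zero when the count is 0, so ignore its status.
    grep -ci 'error' "$1" || true
}

# Example use on the destination Node (BACKUP_IP is a placeholder):
#   is_reachable "$BACKUP_IP" || echo "backup IP not reachable"
#   sockstat | bkpd_bound_to "$BACKUP_IP" || echo "bkpd not bound to $BACKUP_IP"
#   count_log_errors /var/log/bkpd.log
```

On the source Node, the same count_log_errors helper could be run inside the VSM's jail against /var/log/cbdpd.log; jls lists the IDs of the running jails.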

How to check the status of data transfer

  • In ElastiCenter, go to Actions pane > Remote Backup > View Transfer.


Note: Currently, no Alerts are available to notify you of data transfer failures.