CloudByte ElastiStor High Availability takeover and giveback scenarios

This document explains the takeover and giveback scenarios in ElastiStor High Availability.

Sample Setup Assumptions

  • Nodes in your ElastiStor Appliance ESA-A50 are called Node1 and Node2.
  • Both Nodes are in the same HA Group.
  • Node1 is the primary ElastiCenter and Node2 is the secondary ElastiCenter.
  • Node1 hosts Pool1, VSM1 and Storage Volume1.
  • Node2 hosts Pool2, VSM2 and Storage Volume2.
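
The assumptions above can be written down as a small data structure, which makes it easier to refer to each Node's role and resources when reading the scenarios. This is only an illustrative sketch: the class name NodeAssumption, its field names, and the helper partner_of are hypothetical and are not part of ElastiStor or ElastiCenter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeAssumption:
    """One Node of the sample setup (hypothetical model, not a CloudByte API)."""
    name: str
    elasticenter_role: str  # "primary" or "secondary"
    pool: str
    vsm: str
    volume: str


# The sample setup exactly as listed in the assumptions.
SAMPLE_SETUP = {
    "Node1": NodeAssumption("Node1", "primary", "Pool1", "VSM1", "Storage Volume1"),
    "Node2": NodeAssumption("Node2", "secondary", "Pool2", "VSM2", "Storage Volume2"),
}


def partner_of(node_name: str) -> str:
    """Return the other Node in the two-node HA Group."""
    if node_name not in SAMPLE_SETUP:
        raise KeyError(node_name)
    (other,) = [name for name in SAMPLE_SETUP if name != node_name]
    return other


if __name__ == "__main__":
    for node in SAMPLE_SETUP.values():
        print(node.name, node.elasticenter_role, node.pool, "partner:", partner_of(node.name))
```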

[Figure: Appliance node details]

Downtime and takeover scenarios

Note: The scenarios are independent of one another and do not occur in sequence.

Each scenario lists the following fields: Scenario, Reason/Event, System action, Corrective measure, Result.

Scenario: Node1 is down, Node2 takes over Pool1.
Reason/Event:
  • Reboot
  • Software crash
System action:
  • Pool1 is moved to Node2.
  • Node2 is promoted to primary ElastiCenter.
Corrective measure: Bring Node1 online and move it to Available State.
Result:
  • Pool1 is moved back to Node1.
  • Node1 acts as the secondary ElastiCenter.
  • Node2 continues to be the primary ElastiCenter.

Scenario: Node2 is down, Node1 takes over Pool2.
Reason/Event:
  • Reboot
  • Software crash
System action:
  • Pool2 is moved to Node1.
  • Node1 continues to be the primary ElastiCenter.
Corrective measure: Bring Node2 online and move it to Available State.
Result:
  • Pool2 is moved back to Node2.
  • Node2 continues to be the secondary ElastiCenter.

Scenario: Node1 is in Maintenance mode.
Reason/Event: Administrator wants to service the Node.
System action:
  • Pool1 is moved to Node2.
  • Node1 continues to be the primary ElastiCenter.
Corrective measure: Move Node1 to Available State.
Result: Pool1 is moved back to Node1.

Scenario: Node2 is in Maintenance mode.
Reason/Event: Administrator wants to service the Node.
System action:
  • Pool2 is moved to Node1.
  • Node2 continues to be the secondary ElastiCenter.
Corrective measure: Move Node2 to Available State.
Result: Pool2 is moved back to Node2.

Scenario: Management network on Node1 failed.
Reason/Event:
  • Switch port failed
  • NIC port failed
  • Network cable is unplugged
System action:
  • ElastiCenter cannot be accessed.
  • Data sync between the primary and secondary ElastiCenter is stopped.
Corrective measure: Bring the NIC of Node1 online.
Result:
  • Node1 continues to be the primary ElastiCenter.
  • Node2 continues to be the secondary ElastiCenter.
  • Data sync resumes between the primary ElastiCenter and secondary ElastiCenter.

Scenario: Management network on Node2 failed.
Reason/Event:
  • Switch port failed
  • NIC port failed
  • Network cable is unplugged
System action:
  • ElastiCenter can be accessed.
  • Data sync between the primary and secondary ElastiCenter is stopped.
Corrective measure: Bring the NIC of Node2 online.
Result:
  • Node1 continues to be the primary ElastiCenter.
  • Node2 continues to be the secondary ElastiCenter.
  • Data sync resumes between the primary ElastiCenter and secondary ElastiCenter.

Scenario: Data network on Node1 failed.
Reason/Event:
  • Network switch port failed
  • NIC port failed
  • Network cable is unplugged
System action: Pool1 is moved to Node2.
Corrective measure: Bring the network of Node1 online.
Result:
  • Node1 continues to be the primary ElastiCenter.
  • Node2 continues to be the secondary ElastiCenter.
  • Pool1 is automatically moved back to Node1.

Scenario: Data network on Node2 failed.
Reason/Event:
  • Network switch port failed
  • NIC port failed
  • Network cable is unplugged
System action: Pool2 is moved to Node1.
Corrective measure: Bring the network of Node2 online.
Result:
  • Node1 continues to be the primary ElastiCenter.
  • Node2 continues to be the secondary ElastiCenter.
  • Pool2 is automatically moved back to Node2.
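
The scenarios above follow a few rules: a Node that is down, in Maintenance mode, or without a data network hands its Pool to its partner; only a Node that is down causes an ElastiCenter role change; a management network failure stops data sync and, on the primary, makes ElastiCenter unreachable. The following is a minimal sketch of those rules as a toy state model, written to check one's reading of the scenarios. The class HAPair and all of its method names are hypothetical; this is not how ElastiStor is implemented and it does not call any CloudByte API.

```python
from dataclasses import dataclass, field

# Assumed from the sample setup: each Pool's home Node and each Node's partner.
HOME = {"Pool1": "Node1", "Pool2": "Node2"}
PARTNER = {"Node1": "Node2", "Node2": "Node1"}


@dataclass
class HAPair:
    """Hypothetical toy model of the two-node HA Group."""
    primary: str = "Node1"  # Node acting as primary ElastiCenter
    owner: dict = field(default_factory=lambda: dict(HOME))  # Pool -> current Node
    mgmt_down: set = field(default_factory=set)  # Nodes whose management network failed

    @property
    def secondary(self) -> str:
        return PARTNER[self.primary]

    @property
    def elasticenter_accessible(self) -> bool:
        # ElastiCenter is reached through the primary's management network.
        return self.primary not in self.mgmt_down

    @property
    def sync_running(self) -> bool:
        # Sync needs the management network on both Nodes.
        return not self.mgmt_down

    def _takeover(self, node: str) -> None:
        # Every Pool currently on the affected Node moves to its partner.
        for pool, current in list(self.owner.items()):
            if current == node:
                self.owner[pool] = PARTNER[node]

    def node_down(self, node: str) -> None:
        self._takeover(node)
        if node == self.primary:
            self.primary = PARTNER[node]  # partner is promoted to primary

    def maintenance(self, node: str) -> None:
        self._takeover(node)  # ElastiCenter roles are unchanged

    def data_network_failed(self, node: str) -> None:
        self._takeover(node)  # ElastiCenter roles are unchanged

    def management_network_failed(self, node: str) -> None:
        self.mgmt_down.add(node)  # Pools stay where they are

    def restore(self, node: str) -> None:
        # Corrective measure: the Node (or its network) is back; give back its Pools.
        self.mgmt_down.discard(node)
        for pool, home in HOME.items():
            if home == node:
                self.owner[pool] = node


if __name__ == "__main__":
    pair = HAPair()
    pair.node_down("Node1")
    print("after takeover:", pair.owner, "primary:", pair.primary)
    pair.restore("Node1")
    print("after giveback:", pair.owner, "primary:", pair.primary)
```

Note that restore does not change ElastiCenter roles, which matches the first scenario: after Node1 comes back, Node2 remains the primary ElastiCenter and Node1 becomes the secondary.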