Abstract:
Synchronous local and cross-site switchover and switchback operations of a node in a disaster recovery (DR) group are described. In one embodiment, during switchover, a takeover node receives a failover request and responsively identifies a first partner node in a first cluster and a second partner node in a second cluster. The first partner node and the takeover node form a first high-availability (HA) group and the second partner node and a third partner node in the second cluster form a second HA group. The first and second HA groups form the DR group and share a storage fabric. The takeover node synchronously restores client access requests associated with a failed partner node at the takeover node.
Abstract:
One or more techniques and/or systems are provided for dynamic mirroring. A first storage node and a second storage node within a first storage cluster may locally mirror data between one another based upon a local failover partnership. The first storage node and a third storage node within a second storage cluster may remotely mirror data between one another based upon a primary disaster recovery partnership. If the third storage node fails, then the first storage node may remotely mirror data to a fourth storage node within the second storage cluster based upon an auxiliary disaster recovery partnership. In this way, data loss protection for the first storage node may be improved, such that the fourth storage node may provide clients with access to mirrored data from the first storage node in the event the second storage node and/or the third storage node are unavailable when the first storage node fails.
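The fallback from the primary to the auxiliary disaster recovery partner can be illustrated with a minimal sketch. The `Node` class, its `available` flag, and the function `pick_remote_mirror_target` are hypothetical names introduced only for illustration; the abstract does not specify how availability is determined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical model of a storage node (not taken from the abstract)."""
    name: str
    cluster: str
    available: bool = True


def pick_remote_mirror_target(primary_dr: Node, auxiliary_dr: Node) -> Optional[Node]:
    """Prefer the primary DR partner; fall back to the auxiliary DR partner.

    Returns None when neither remote partner is available.
    """
    if primary_dr.available:
        return primary_dr
    if auxiliary_dr.available:
        return auxiliary_dr
    return None


# The third and fourth storage nodes both reside in the second cluster.
node3 = Node("node3", "cluster2")
node4 = Node("node4", "cluster2")
target = pick_remote_mirror_target(node3, node4)  # node3 while it is available
```

If `node3.available` is set to `False`, the same call returns `node4`, which corresponds to mirroring under the auxiliary disaster recovery partnership.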
Abstract:
During a storage redundancy giveback from a first node to a second node following a storage redundancy takeover from the second node by the first node, the second node is initialized in part by receiving a node identification indicator from the second node. The node identification indicator is included in a node advertisement message sent by the second node during a giveback wait phase of the storage redundancy giveback. The node identification indicator includes an intra-cluster node connectivity identifier that is used by the first node to determine whether the second node is an intra-cluster takeover partner. In response to determining that the second node is an intra-cluster takeover partner, the first node completes the giveback of storage resources to the second node.
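The giveback decision may be sketched as follows. The abstract states only that the intra-cluster node connectivity identifier is used to determine whether the sender is an intra-cluster takeover partner; the equality comparison, the `NodeAdvertisement` class, and the function name below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeAdvertisement:
    """Hypothetical shape of the node advertisement message."""
    node_name: str
    intra_cluster_connectivity_id: str


def should_complete_giveback(
    advertisement: NodeAdvertisement,
    expected_partner_connectivity_id: str,
    in_giveback_wait_phase: bool,
) -> bool:
    """Decide whether the first node may complete the giveback.

    Assumption: the sender counts as the intra-cluster takeover partner
    when its connectivity identifier equals the one the first node expects.
    """
    if not in_giveback_wait_phase:
        # The advertisement is only considered during the giveback wait phase.
        return False
    return advertisement.intra_cluster_connectivity_id == expected_partner_connectivity_id


adv = NodeAdvertisement(node_name="node2", intra_cluster_connectivity_id="ic-42")
ok = should_complete_giveback(adv, "ic-42", in_giveback_wait_phase=True)
```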
Abstract:
One or more techniques and/or computing devices are provided for automatic switchover implementation. For example, a first storage controller, of a first storage cluster, may have a disaster recovery relationship with a second storage controller of a second storage cluster. In the event the first storage controller fails, the second storage controller may automatically switch over operation from the first storage controller to the second storage controller for providing clients with failover access to data previously accessible to the clients through the first storage controller. The second storage controller may detect, cross-cluster, a failure of the first storage controller utilizing remote direct memory access (RDMA) read operations to access heartbeat information, heartbeat information stored within a disk mailbox, and/or service processor traps. In this way, the second storage controller may efficiently detect failure of the first storage controller to trigger automatic switchover for non-disruptive client access to data.
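One way the three failure signals might be combined is sketched below. The combination policy, the timeout value, and all names (`HeartbeatSignals`, `partner_failed`) are assumptions for illustration; the abstract names the signal sources but does not describe how they are weighed against one another.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatSignals:
    """Hypothetical snapshot of what the second controller has observed."""
    rdma_last_beat: Optional[float]     # timestamp in seconds; None if the RDMA read failed
    mailbox_last_beat: Optional[float]  # timestamp in seconds; None if the mailbox was unreadable
    sp_trap_received: bool = False      # a service processor trap reported a failure


def partner_failed(signals: HeartbeatSignals, now: float, timeout: float = 5.0) -> bool:
    """Assumed policy: a trap is decisive; otherwise the freshest heartbeat decides."""
    if signals.sp_trap_received:
        return True
    beats = [b for b in (signals.rdma_last_beat, signals.mailbox_last_beat) if b is not None]
    if not beats:
        # Assumption: with no readable heartbeat source, treat the partner as failed.
        return True
    return now - max(beats) > timeout


healthy = partner_failed(HeartbeatSignals(100.0, 98.0), now=102.0)
```

A result of `True` would be the trigger for the automatic switchover described above.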
Abstract:
A system and method for handling multi-node failures in a disaster recovery cluster are provided. In the event of an error condition, a switchover operation occurs from the failed nodes to one or more surviving nodes. Data stored in non-volatile random access memory is recovered by the surviving nodes to bring storage objects, e.g., disks, aggregates, and/or volumes, into a consistent state.
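The recovery step can be pictured as replaying logged writes onto the storage objects. The log layout, the in-memory volume model, and the function `replay_nvram_log` are hypothetical; the abstract does not describe the format of the data held in non-volatile random access memory.

```python
from typing import Dict, List, Tuple

# Hypothetical log entry: (volume name, block number, data written).
LogEntry = Tuple[str, int, bytes]


def replay_nvram_log(volumes: Dict[str, Dict[int, bytes]], log: List[LogEntry]) -> None:
    """Apply logged writes in order so that later writes win.

    Assumption: replaying every logged write in order is enough to bring
    the volumes to a consistent state.
    """
    for volume, block, data in log:
        volumes.setdefault(volume, {})[block] = data


volumes = {"vol1": {0: b"old"}}
replay_nvram_log(volumes, [("vol1", 0, b"new"), ("vol1", 1, b"x")])
```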
Abstract:
A system and method are provided for migrating data from a source storage site to a destination storage site. The data may be contained within storage objects (e.g., flexible volumes). A base storage object may comprise a parent storage object and a storage object clone may comprise a storage object that is derived from the base storage object. As such, a hierarchical relationship exists between the base storage object and the storage object clone. The storage object clone may comprise a writable point-in-time image of the parent storage object. If a migration of the base storage object and the storage object clone is performed, then the hierarchical relationship between the base storage object and the storage object clone is retained after the storage objects are migrated from the source storage site to the destination storage site. As such, the system and method for migrating data may enable storage space and network bandwidth savings.
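Retaining the hierarchy during migration may be sketched as copying each parent before its clones and carrying the parent reference across. The `StorageObject` class, the `own_blocks` field (blocks not shared with the parent), and the `migrate` function are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StorageObject:
    """Hypothetical model of a storage object such as a flexible volume."""
    name: str
    parent: Optional[str] = None  # name of the base storage object, if this is a clone
    own_blocks: int = 0           # assumed count of blocks not shared with the parent


def migrate(source: Dict[str, StorageObject]) -> Dict[str, StorageObject]:
    """Copy objects to the destination, parents first, keeping parent links."""
    destination: Dict[str, StorageObject] = {}

    def copy(name: str) -> None:
        if name in destination:
            return
        obj = source[name]
        if obj.parent is not None:
            copy(obj.parent)  # the base must exist before its clone
        destination[name] = StorageObject(obj.name, obj.parent, obj.own_blocks)

    for name in source:
        copy(name)
    return destination


source_site = {
    "clone": StorageObject("clone", parent="base", own_blocks=10),
    "base": StorageObject("base", own_blocks=1000),
}
destination_site = migrate(source_site)
```

Under the assumption that a clone shares unmodified blocks with its parent, only each object's own blocks need to be transferred, which is where the space and bandwidth savings would come from.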