-
Publication No.: US20240036996A1
Publication date: 2024-02-01
Application No.: US17875837
Application date: 2022-07-28
Applicant: NetApp, Inc.
Inventor: Anoop Vijayan, Akhil Kaushik, Sohan Shetty, Dhruvil Shah
IPC: G06F11/20
CPC classification number: G06F11/2012, G06F2201/85
Abstract: Multi-site distributed storage systems and computer-implemented methods are described for improving a resumption time of input/output (I/O) operations during an automatic unplanned failover (AUFO). A computer-implemented method includes determining, with a second storage cluster, whether heartbeat information from one or more storage objects of a CG of a first set of CGs is received during a time period, determining an out of sync state for a data replication relationship between the CG of the first set of CGs and a mirrored CG of a second set of CGs when the heartbeat information is not received during the time period and sending a single bulk role change call with a cluster identifier from the second cluster to an external mediator to provide a role change from follower to leader in the second set of CGs.
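
The flow in this abstract (missed heartbeat, out-of-sync determination, one bulk role change call) can be illustrated with a minimal sketch. It is not the patented implementation: the `Mediator` class, the `check_heartbeats` function, the timestamp dictionary, and the cluster names are all hypothetical, chosen only to show why a single call carrying a cluster identifier replaces one call per CG.

```python
from dataclasses import dataclass


@dataclass
class Mediator:
    """Hypothetical stand-in for the external mediator."""
    leader_cluster: str
    calls: int = 0

    def bulk_role_change(self, cluster_id: str) -> None:
        # One call changes the role for every CG of the given cluster.
        self.calls += 1
        self.leader_cluster = cluster_id


def check_heartbeats(last_seen, now, timeout_s, mediator, local_cluster_id):
    """Return the CGs whose heartbeat was missed (treated as out of sync).

    last_seen maps a CG name to the time its last heartbeat arrived.
    If any CG is out of sync, exactly one bulk role change call is sent.
    """
    out_of_sync = sorted(
        cg for cg, seen in last_seen.items() if now - seen > timeout_s
    )
    if out_of_sync:
        mediator.bulk_role_change(local_cluster_id)
    return out_of_sync
```

In this sketch the number of mediator calls stays at one regardless of how many CGs miss their heartbeat, which is the property the abstract attributes to the bulk call.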
-
Publication No.: US11841781B2
Publication date: 2023-12-12
Application No.: US18066775
Application date: 2022-12-15
Applicant: NetApp, Inc.
Inventor: Akhil Kaushik, Anoop Vijayan, Omprakash Khandelwal
CPC classification number: G06F11/2094, G06F3/065, G06F3/067, G06F3/0619, G06F3/0644, G06F2201/82
Abstract: Systems and methods are described for a non-disruptive planned failover from a primary copy of data at a primary storage system to a mirror copy of the data at a cross-site secondary storage system. According to an example, a planned failover feature of a multi-site distributed storage system provides an order of operations such that a primary copy of a first data center continues to serve I/O operations until a mirror copy of a second data center is ready. This planned failover feature improves functionality and efficiency of the distributed storage system by providing non-disruptiveness during planned failover—even if various failures occur. The planned failover feature also includes a persistent fence to avoid serving I/O operations during a timing window when both primary data storage and secondary data storage are attempting to have a master role to serve I/O operations and this avoids a split-brain situation.
-
Publication No.: US11704207B2
Publication date: 2023-07-18
Application No.: US17881381
Application date: 2022-08-04
Applicant: NetApp, Inc.
Inventor: Akhil Kaushik, Anoop Vijayan
CPC classification number: G06F11/2069, G06F11/1466, G06F11/1469, G06F11/3034
Abstract: Systems and methods are described for a non-disruptive planned failover from a primary copy of data at a primary storage cluster to a mirror copy of the data at a cross-site secondary storage cluster without using an external mediator. According to an example, a planned failover feature of a multi-site distributed storage system provides an order of operations such that a primary copy of a first data center continues to serve I/O operations until a mirror copy of a second data center is ready. This planned failover feature improves functionality and efficiency of the distributed storage system by providing non-disruptiveness during planned failover without using an external mediator based on a primary storage cluster being selected as an authority to implement a state machine with a persistent configuration database to track a planned failover state for the planned failover.
-
Publication No.: US11537314B1
Publication date: 2022-12-27
Application No.: US17495990
Application date: 2021-10-07
Applicant: NetApp, Inc.
Inventor: Murali Subramanian, Akhil Kaushik, Anoop Vijayan, Arun Kumar Selvam
IPC: G06F3/06
Abstract: Systems and methods are provided for bringing a volume of a consistency group (CG) into an in-synchronization (InSync) state while other volumes of the CG remain in the InSync state. According to an example, in order to support recovery from disruptive events in a manner that ensures a zero recovery point objective (RPO) guarantee and insulates an application making use of the CG from adverse impacts, responsive to a triggering event, a Fast Resync process may first be attempted to promptly bring an affected volume back into an in-synchronization (InSync) state from an out of synchronization (OOS) state while allowing other members of the CG to remain in the InSync state. Should the Fast resync process be unsuccessful in bringing the volume back into the InSync state within a predetermined or configurable time threshold, then a second type of resynchronization process may be employed at the CG level.
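
The two-tier recovery described in this abstract (try a volume-level Fast Resync first, fall back to a CG-level resync if the time threshold expires) can be sketched as follows. This is an illustrative sketch only: `resync_volume`, its callback parameters, and the injectable `clock` are hypothetical names, and a real attempt would block on replication work rather than return immediately.

```python
import time


def resync_volume(try_fast_resync, cg_level_resync, threshold_s,
                  clock=time.monotonic):
    """Attempt a Fast Resync until threshold_s elapses, then fall back.

    try_fast_resync: callable returning True once the volume is InSync.
    cg_level_resync: callable that runs the CG-level resynchronization.
    Returns "fast" or "cg-level" to show which path was taken.
    """
    deadline = clock() + threshold_s
    while clock() < deadline:
        if try_fast_resync():
            # Only the affected volume was touched; the other CG members
            # stayed InSync throughout.
            return "fast"
    # Threshold exceeded: escalate to the second resync type.
    cg_level_resync()
    return "cg-level"
```

Passing the clock as a parameter is a design choice of the sketch, made so the threshold behavior can be exercised without waiting in real time.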
-
Publication No.: US20220318107A1
Publication date: 2022-10-06
Application No.: US17219812
Application date: 2021-03-31
Applicant: NetApp, Inc.
Inventor: Akhil Kaushik, Anoop Vijayan, Omprakash Khandelwal
Abstract: Systems and methods are described for a non-disruptive planned failover from a primary copy of data at a primary storage system to a mirror copy of the data at a cross-site secondary storage system. According to an example, a planned failover feature of a multi-site distributed storage system provides an order of operations such that a primary copy of a first data center continues to serve I/O operations until a mirror copy of a second data center is ready. This planned failover feature improves functionality and efficiency of the distributed storage system by providing non-disruptiveness during planned failover—even if various failures occur. The planned failover feature also includes a persistent fence to avoid serving I/O operations during a timing window when both primary data storage and secondary data storage are attempting to have a master role to serve I/O operations and this avoids a split-brain situation.
-
Publication No.: US20250165359A1
Publication date: 2025-05-22
Application No.: US19033913
Application date: 2025-01-22
Applicant: NetApp, Inc.
Inventor: Anoop Vijayan, Akhil Kaushik, Dhruvil Shah
Abstract: Multi-site distributed storage systems and computer-implemented methods are described for improving a resumption time of input/output (I/O) operations during an automatic unplanned failover (AUFO). A computer-implemented method includes monitoring, with a second cluster, heartbeat information received at ultra-short time intervals from a first connection of one or more storage objects of the first cluster, determining, with the second cluster, whether the heartbeat information from the first connection is received during an ultra-short time interval, and intelligently routing heartbeat information from the one or more storage objects of the first cluster from the first connection to a second connection when the heartbeat information from the first connection is not received during the ultra-short time interval.
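
The routing decision in this abstract (switch heartbeats from the first connection to a second one when an interval passes without a heartbeat) can be sketched as below. The abstract does not say how the routing is implemented, so `HeartbeatRouter`, its fields, and the connection labels are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatRouter:
    """Hypothetical monitor run by the second cluster."""
    interval_s: float          # the "ultra-short" heartbeat interval
    active: str = "first"      # connection currently carrying heartbeats
    last_rx: float = 0.0       # arrival time of the latest heartbeat

    def on_heartbeat(self, now: float) -> None:
        self.last_rx = now

    def tick(self, now: float) -> str:
        """Reroute to the second connection if the interval was missed."""
        if self.active == "first" and now - self.last_rx > self.interval_s:
            self.active = "second"
            # Restart the window so the new connection gets a full interval.
            self.last_rx = now
        return self.active
```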
-
Publication No.: US20250053488A1
Publication date: 2025-02-13
Application No.: US18928972
Application date: 2024-10-28
Applicant: NetApp, Inc.
Inventor: Anoop Vijayan, Akhil Kaushik, Sohan Shetty, Dhruvil Shah
IPC: G06F11/20
Abstract: Multi-site distributed storage systems and computer-implemented methods are described for improving a resumption time of input/output (I/O) operations during an automatic unplanned failover (AUFO). A computer-implemented method includes determining, with a second storage cluster, whether heartbeat information from one or more storage objects of a CG of a first set of CGs is received during a time period, determining an out of sync state for a data replication relationship between the CG of the first set of CGs and a mirrored CG of a second set of CGs when the heartbeat information is not received during the time period and sending a single bulk role change call with a cluster identifier from the second cluster to an external mediator to provide a role change from follower to leader in the second set of CGs.
-
Publication No.: US20240338145A1
Publication date: 2024-10-10
Application No.: US18296834
Application date: 2023-04-06
Applicant: NetApp, Inc.
Inventor: Sohan Shetty, Anoop Vijayan, Akhil Kaushik, Rohit Chaudhary
IPC: G06F3/06
CPC classification number: G06F3/0655, G06F3/0604, G06F3/067
Abstract: According to an example, a computer-implemented method comprises initiating a first process for atomically setting the primary bias state with a first node of a primary storage cluster of a multi-site distributed storage system due to a temporary loss of connectivity to a mediator or a temporary mediator failure, releasing an atomic lock for the first process on the first node of the primary storage cluster, sending the first process and an associated first generation indicator to a first node of a secondary storage cluster of the multi-site distributed storage system to handle the first process for setting the primary bias state, and initiating a second process for atomically clearing a primary bias state with the first node or any node of the primary storage cluster based on detecting a connection to the mediator or detecting that the mediator is available.
-
Publication No.: US20240329843A1
Publication date: 2024-10-03
Application No.: US18672604
Application date: 2024-05-23
Applicant: NetApp, Inc.
Inventor: Anoop Vijayan, Akhil Kaushik, Sohan Shetty, Dhruvil Shah
IPC: G06F3/06
CPC classification number: G06F3/0611, G06F3/0614, G06F3/0655, G06F3/067
Abstract: Multi-site distributed storage systems and computer-implemented methods are described for improving a resumption time for processing of input/output (I/O) operations during an automatic unplanned failover (AUFO). A first storage cluster includes a first set of consistency groups (CGs) and a second storage cluster includes a second mirrored set of CGs. A computer-implemented method includes prefetching, with a user space of the second storage cluster, configuration information from a replicated database prior to starting the AUFO workflow, sending the configuration information to a kernel space of the second storage cluster on a per CG level while queuing the AUFO workflow, and determining if any in progress workflows conflict with the AUFO workflow.
-
Publication No.: US11941267B2
Publication date: 2024-03-26
Application No.: US18360133
Application date: 2023-07-27
Applicant: NetApp, Inc.
Inventor: Arul Valan, Anoop Vijayan, Akhil Kaushik
CPC classification number: G06F3/0631, G06F3/0604, G06F3/067
Abstract: Systems and methods for making a cross-site storage solution resilient towards mediator unavailability are provided. According to one embodiment, a stretched storage system is operable to bring a mediator associated with a primary and secondary distributed storage system back into the role of an arbitrator for peered consistency groups (CGs). A mediator reseed status indicator is maintained for multiple CGs to identify when the mediator's status information for a CG is stale. When the mediator becomes available and a local CG is identified as the subject of a mediator reseed process, the master node of the primary that hosts a master copy of a dataset for the local CG performs the reseed process, including: (i) causing relationship status information for the local CG to be updated on the mediator to the current state maintained by the primary; and (ii) resetting the mediator reseed status indicator.
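
The reseed steps (i) and (ii) in this abstract can be sketched with a per-CG stale marker. The abstract does not state how a CG gets flagged for reseeding, so the `update_status` path below, which flags a CG whenever the mediator cannot be reached, is an assumption; the class and field names are likewise hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Mediator:
    """Hypothetical mediator holding per-CG relationship status."""
    available: bool = False
    status: dict = field(default_factory=dict)


@dataclass
class PrimaryMaster:
    """Hypothetical master node of the primary storage system."""
    relationship_status: dict                       # CG -> current state
    needs_reseed: set = field(default_factory=set)  # reseed indicators

    def update_status(self, cg, state, mediator):
        self.relationship_status[cg] = state
        if mediator.available:
            mediator.status[cg] = state
        else:
            # Assumption: the mediator's copy is now stale, so flag the CG.
            self.needs_reseed.add(cg)

    def reseed(self, mediator):
        """Run the reseed process; return the CGs that were reseeded."""
        if not mediator.available:
            return []
        done = sorted(self.needs_reseed)
        for cg in done:
            mediator.status[cg] = self.relationship_status[cg]  # step (i)
            self.needs_reseed.discard(cg)                       # step (ii)
        return done
```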