Abstract:
An optimized segment cleaning technique is configured to efficiently clean one or more selected portions or segments of a storage array coupled to one or more nodes of a cluster. A bottom-up approach of the segment cleaning technique is configured to read all blocks of a segment to be cleaned (i.e., an "old" segment) to locate extents stored on the SSDs of the old segment and examine extent metadata to determine whether the extents are valid and, if so, relocate the valid extents to a segment being written (i.e., a "new" segment). A top-down approach of the segment cleaning technique obviates reading of the blocks of the old segment to locate the extents and, instead, examines the extent metadata to determine the valid extents of the old segment. A hybrid approach may extend the top-down approach to include only full stripe read operations needed for relocation and reconstruction of blocks as well as retrieval of valid extents from the stripes, while also avoiding any unnecessary read operations of the bottom-up approach.
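The difference between the bottom-up and top-down approaches can be illustrated with a minimal sketch. It assumes a hypothetical in-memory model that is not taken from the abstract: a segment is a list of blocks, each block holds at most one extent, and the extent metadata is a dictionary mapping an extent key to a (segment id, block index) pair. The names `Segment`, `clean_bottom_up`, and `clean_top_down` are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    seg_id: int
    # Each block is either None (empty) or an (extent_key, data) pair.
    blocks: list = field(default_factory=list)


def clean_bottom_up(old, new, extent_map):
    """Read every block of the old segment, then consult metadata for validity."""
    blocks_read = 0
    for idx, block in enumerate(old.blocks):
        blocks_read += 1  # every block is read, valid or not
        if block is None:
            continue
        key, data = block
        if extent_map.get(key) == (old.seg_id, idx):  # metadata still points here
            new.blocks.append((key, data))
            extent_map[key] = (new.seg_id, len(new.blocks) - 1)
    return blocks_read


def clean_top_down(old, new, extent_map):
    """Consult metadata first, then read only the blocks holding valid extents."""
    valid = [(key, loc[1]) for key, loc in extent_map.items() if loc[0] == old.seg_id]
    blocks_read = 0
    for key, idx in valid:
        blocks_read += 1  # only blocks known to be valid are read
        _, data = old.blocks[idx]
        new.blocks.append((key, data))
        extent_map[key] = (new.seg_id, len(new.blocks) - 1)
    return blocks_read
```

In this toy model both functions relocate the same extents; the top-down variant simply performs fewer block reads when the old segment contains stale or empty blocks.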
Abstract:
Data consistency and availability can be provided at the granularity of logical storage objects in storage solutions that use storage virtualization in clustered storage environments. To ensure consistency of data across different storage elements, synchronization is performed across the different storage elements. Changes to data are synchronized across storage elements in different clusters by propagating the changes from a primary logical storage object to a secondary logical storage object. To satisfy the strictest RPOs while maintaining performance, change requests are intercepted prior to being sent to a filesystem that hosts the primary logical storage object and propagated to a different managing storage element associated with the secondary logical storage object.
Abstract:
A first request to execute a first task is received from a first module in a first address space and by a second module in a second address space. The first task is placed into a task queue for execution in the second address space. Pending responses not yet returned to the first module that are results of execution for other tasks in the second address space are extracted by the second module from a response queue. Requests for the other tasks were previously sent by the first module to the second module for execution in the second address space. The pending responses are compounded. The pending responses and a return value acknowledging the first request to execute the first task are combined, by the second module, into a combined communication. The combined communication is transmitted by the second module to the first module in the first address space.
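The compounding of pending responses with an acknowledgement can be sketched as follows. This is a single-process illustration only: the two address spaces are not modeled, and the class `SecondModule`, its method names, and the dictionary layout of the combined communication are assumptions introduced for the example.

```python
from collections import deque


class SecondModule:
    def __init__(self):
        self.task_queue = deque()      # tasks awaiting execution
        self.response_queue = deque()  # results not yet returned to the caller

    def complete_next_task(self):
        """Execute the oldest queued task and park its result as a pending response."""
        task_id, func, args = self.task_queue.popleft()
        self.response_queue.append((task_id, func(*args)))

    def handle_request(self, task_id, func, args):
        """Queue the new task and reply with an ack plus all pending responses."""
        self.task_queue.append((task_id, func, args))
        pending = []
        while self.response_queue:
            pending.append(self.response_queue.popleft())
        # One combined communication instead of one message per response.
        return {"ack": task_id, "responses": pending}
```

The design point the sketch tries to capture is that results of earlier tasks ride along with the acknowledgement of a later request, so the caller receives them without a separate round trip.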
Abstract:
A method, non-transitory computer readable medium, and apparatus that monitors an active virtual storage controller. A determination of when a failure of the active virtual storage controller has occurred is made based on the monitoring. When the failure of the active virtual storage controller is determined to have occurred, storage devices previously assigned to the active virtual storage controller are remapped to a passive virtual storage controller and transactions in a transaction log are replayed. In another example, active storage controllers are monitored with a passive storage controller. When a failure of one of the active storage controllers is determined, based on the monitoring, to have occurred, storage devices previously assigned to the failed active storage controller are remapped, a transaction log associated with that active storage controller is retrieved from a transaction log database, and transactions in the transaction log are replayed.
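The remap-and-replay sequence can be sketched in a few lines. The model below is an assumption made for illustration: a controller is a plain object with an `alive` flag, a list of device names, and a key-value state, and the transaction log is a list of (key, value) writes. How failure is actually detected is outside the sketch.

```python
class Controller:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.devices = []  # storage devices currently assigned
        self.state = {}    # hypothetical state rebuilt from the log


def fail_over(active, passive, transaction_log):
    """Remap devices to the passive controller and replay the log if active failed."""
    if active.alive:
        return False  # monitoring found no failure; nothing to do
    passive.devices.extend(active.devices)  # remap storage devices
    active.devices = []
    for key, value in transaction_log:      # replay transactions in order
        passive.state[key] = value
    return True
```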
Abstract:
A system and method for connectivity-aware assignment of volumes among the storage controllers of a storage system is provided. In some embodiments, during a discovery phase, a connectivity metric is determined from a device discovery command. The connectivity metric is recorded into a data structure that identifies a plurality of hosts and a plurality of storage controllers of a storage system. In response to the determining of the connectivity metric, a storage controller ownership of a first volume is changed to improve connectivity between a host of the plurality of hosts and the first volume. In some such embodiments, a storage controller ownership of a second volume is changed to balance load among the plurality of storage controllers, and the discovery phase is, in part, a response to the change in the storage controller ownership of the second volume.
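A minimal sketch of connectivity-aware ownership changes follows. It assumes, purely for illustration, that the recorded data structure is a dictionary keyed by (host, controller) whose value is a numeric connectivity metric where higher is better, and that each volume is accessed by a single host. The function name and parameters are hypothetical.

```python
def rebalance_for_connectivity(ownership, connectivity, volume_hosts):
    """Move each volume to the controller with the best metric for its host.

    ownership:     volume -> owning controller (updated in place)
    connectivity:  (host, controller) -> metric, higher is better
    volume_hosts:  volume -> host that accesses it
    Returns a dict of volume -> (old controller, new controller) for each change.
    """
    controllers = sorted({controller for _, controller in connectivity})
    changes = {}
    for volume, host in volume_hosts.items():
        current = ownership[volume]
        best = max(controllers, key=lambda c: connectivity.get((host, c), 0))
        # Change ownership only when it strictly improves connectivity.
        if connectivity.get((host, best), 0) > connectivity.get((host, current), 0):
            ownership[volume] = best
            changes[volume] = (current, best)
    return changes
```

Load balancing, which the abstract mentions as a separate trigger for ownership changes, is deliberately left out of this sketch.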
Abstract:
A system and method for data replication is described. A destination storage system receives a message from a source storage system as part of a replication process. The message includes an identity of a first file, information about where the first file is stored in the source storage system, a name of a first data being used by the first file and stored at a first location of the source storage system, and a fingerprint of the first data. The destination storage system determines that a mapping database is unavailable or inaccurate, and accesses a fingerprint database using the fingerprint of the first data received with the message to determine whether data stored in the destination storage system has a fingerprint identical to the fingerprint of the first data.
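The fallback from the mapping database to the fingerprint database can be sketched as below. Several details are assumptions not stated in the abstract: the fingerprint is modeled as a SHA-256 digest, both databases are dictionaries, the message is a dictionary with `data_name` and `fingerprint` fields, and an unavailable or distrusted mapping database is represented by `None`.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hypothetical fingerprint function; SHA-256 is an assumption."""
    return hashlib.sha256(data).hexdigest()


def resolve_data(message, mapping_db, fingerprint_db):
    """Return the destination location of matching data, or None if absent.

    mapping_db:     source data name -> destination location, or None when the
                    mapping database is unavailable or considered inaccurate
    fingerprint_db: fingerprint -> destination location
    """
    if mapping_db is not None:
        return mapping_db.get(message["data_name"])
    # Mapping database cannot be used: look the data up by its fingerprint.
    return fingerprint_db.get(message["fingerprint"])
```

A `None` result would mean the destination holds no matching data, in which case the data would presumably have to be transferred from the source.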
Abstract:
Embodiments described herein are directed to a file system driven RAID rebuild technique. A layered file system may organize storage of data as segments spanning one or more sets of storage devices, such as solid state drives (SSDs), of a storage array, wherein each set of SSDs may form a RAID group configured to provide data redundancy for a segment. The file system may then drive (i.e., initiate) rebuild of a RAID configuration of the SSDs on a segment-by-segment basis in response to cleaning of the segment (i.e., segment cleaning). Each segment may include one or more RAID stripes that provide a level of data redundancy (e.g., single parity RAID 5 or double parity RAID 6) as well as RAID organization (i.e., distribution of data and parity) for the segment. Notably, the level of data redundancy and RAID organization may differ among the segments of the array.
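As a narrow illustration of the redundancy a single-parity stripe provides, the sketch below reconstructs one lost block of a stripe by XOR. It covers only the RAID 5 style single-parity case for one stripe; the segment-by-segment triggering by the file system described above is not modeled, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild the single missing block (marked None) of a single-parity stripe.

    The stripe is a list of equal-length byte strings, data blocks plus one
    parity block, with exactly one entry replaced by None.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing block")
    survivors = [block for block in stripe if block is not None]
    rebuilt = list(stripe)
    rebuilt[missing[0]] = xor_blocks(survivors)
    return rebuilt
```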
Abstract:
A method, non-transitory computer readable medium and programmed apparatus that receives a request to replicate a volume from a source to a destination. The volume includes data and metadata including information descriptive of the data. The method includes determining a first set of blocks and a second set of blocks associated with the source, where the first set of blocks is associated with the metadata, and where the second set of blocks is associated with the data. The method includes initiating, based on the first set of blocks, replication of the volume from the source to the destination to generate a replicated volume at the destination. The replicated volume includes replicated metadata generated based on the replicated first set of blocks and includes absent allocated data corresponding to the data included in the volume stored at the source storage system.
Abstract:
Technology is disclosed for managing data in a distributed processing system ("the technology"). In various embodiments, the technology pushes "cold" data from a primary storage of the distributed processing system to a backup storage, thereby maximizing the usage of the space on the primary storage to store "hot" data on which most data processing activities are performed in the distributed processing system. The cold data is retrieved from the backup storage into the primary storage on demand, for example, upon receiving an access request from a client. While the primary storage stores the data in a format specific to the distributed processing system, the backup storage stores the data in a different format, for example, a format corresponding to the type of backup storage.
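The push of cold data and its on-demand retrieval can be sketched with two dictionaries. The formats are stand-ins chosen for the example: the primary storage keeps Python dictionaries as the "native" format and the backup storage keeps JSON strings as the "different" format. The class `TieredStore` and its methods are hypothetical, and the policy that decides which data is cold is not modeled.

```python
import json


class TieredStore:
    def __init__(self):
        self.primary = {}  # key -> dict, format of the processing system
        self.backup = {}   # key -> JSON string, format of the backup storage

    def write(self, key, record):
        self.primary[key] = record

    def push_cold(self, key):
        """Convert a cold record to the backup format and free primary space."""
        self.backup[key] = json.dumps(self.primary.pop(key), sort_keys=True)

    def read(self, key):
        """Serve from primary; on a miss, retrieve from backup on demand."""
        if key not in self.primary:
            self.primary[key] = json.loads(self.backup[key])
        return self.primary[key]
```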