Abstract:
A method, a computing device, and a non-transitory machine-readable medium for assessing data segments for garbage collection is provided. In some embodiments, the method includes identifying a plurality of data segments. A first rate at which data within each of the plurality of data segments has been invalidated since a first point in time is determined, and a second rate at which data within each of the plurality of data segments has been invalidated since a second point in time subsequent to the first point in time is determined. The second rate is compared to the first rate for each of the plurality of data segments, and a garbage collection score is assigned to the respective data segment based on the comparison. The garbage collection score may be further based on a utilization of the respective data segment and/or an age of the respective data segment.
Abstract:
A system and method for improving storage system performance by reducing or avoiding load spike amplification when performing garbage collection is disclosed. A storage controller in a storage system tracks system load including write load and read load, as well as available free segments. The storage controller uses these tracked values as inputs and, with these inputs, generates a garbage collection rate. Where read load is included, a scaled portion of the read load is taken into consideration so that, as the number of free segments nears the minimum amount desired and to prevent garbage collecting too slowly, the read load is gradually excluded from the garbage collection rate determination. The garbage collection rate is therefore responsive to system load so that, in times of high system load, the rate reduces as much as is safe so that the write load takes priority with computing resources of the storage controller.
Abstract:
Disclosed are systems, computer-readable mediums, and methods for transforming data in a file system. As part of a recycling process, a determination is made that transformations should be attempted. A data block is determined to be in use by at least one user of the storage system. If a transformation should be attempted on the data block is determined. Parameters associated with the performance of the file system can be used in this determination. A type of transformation to be done is determined. The data block is transformed based upon the selected transformation. The transformed data block is written to the storage system. As part of the recycling process, the transformation requires no additional input/output requests.
Abstract:
One or more techniques and/or computing devices are provided for directory level incremental replication. For example, a first storage controller may evaluate a base snapshot and an incremental snapshot of a source subdirectory to generate a set of operations that can be used by a second storage controller for reconstructing a mirror of the source subdirectory as reflected by the incremental snapshot. Accordingly, the first storage controller may send the set of operations and/or source data to the second storage controller for constructing a destination directory structure mirroring the source subdirectory. In this way, replication may be achieved at an arbitrary level of granularity, such as to replicate a particular subdirectory of a volume.
Abstract:
A system and method for recovering a dataset is provided that analyzes the dataset as it currently exists in order to determine those portions that do not need to be recovered. In some embodiments, the method includes identifying a dataset stored on a set of storage devices and corresponding to a first point in time. A request to restore the dataset to a second point in time is received, and a subset of the dataset is identified that is different between the first point in time and the second point in time. Data associated with the subset is selectively retrieved that corresponds to the second point in time, and the retrieved data is merged with the dataset stored on the set of storage devices. The two points in time may have any relationship, and in various examples, the method performs a roll-back or a roll-forward of the dataset.
Abstract:
One or more techniques and/or computing devices are provided for automatic switchover implementation. For example, a first storage controller, of a first storage cluster, may have a disaster recovery relationship with a second storage controller of a second storage cluster. In the event the first storage controller fails, the second storage controller may automatically switchover operation from the first storage controller to the second storage controller for providing clients with failover access to data previously accessible to the clients through the first storage controller. The second storage controller may detect, cross-cluster, a failure of the first storage controller utilizing remote direct memory access (RDMA) read operations to access heartbeat information, heartbeat information stored within a disk mailbox, and/or service processor traps. In this way, the second storage controller may efficiently detect failure of the first storage controller to trigger automatic switchover for non-disruptive client access to data.
Abstract:
Systems and techniques for cache management are disclosed that provide improved cache performance by prioritizing particular storage stripes for cache flush operations. The systems and techniques may also leverage features of the storage devices to provide atomicity without the overhead of inter-controller mirroring. In some embodiments, the systems and techniques include a storage controller that stores data in a cache. The data is associated with one or more sectors of a storage stripe that is defined over plurality of storage devices. The storage controller identifies a locality of dirty sectors of the one or more sectors, classifies the storage stripe into a category based on the locality, provides a category ordering of the category relative to at least one other category, and flushes the storage stripe from the cache to the plurality of storage devices according to the category ordering.
Abstract:
A method for migration of operations between CPU cores, the method includes: processing, by a source core, one or more tasks and one or more interrupt service routines; accessing a mapping corresponding to a task of the one or more tasks and an interrupt service routine of the one or more interrupt service routines; identifying, based on the mapping, a target core that corresponds to the task and the interrupt service routine; blocking the task from being processed by the source core in response to identifying the target core; in response to identifying the target core, disabling an interrupt corresponding to the interrupt service routine; in response to identifying the target core, assigning the task and the interrupt to the target core; after assigning the interrupt to the target core, enabling the interrupt; and after assigning the task to the target core, processing the task by the target core.
Abstract:
A system and method for recognizing data access patterns in large data sets and for preloading a cache based on the recognized patterns is provided. In some embodiments, the method includes receiving a data transaction directed to an address space and recording the data transaction in a first set of counters and in a second set of counters. The first set of counters divides the address space into address ranges of a first size, whereas the second set of counters divides the address space into address ranges of a second size that is different from the first size. One of a storage device or a cache thereof is selected to service the data transaction based on the first set of counters, and data is preloaded into the cache based on the second set of counters.
Abstract:
A system and method for handling multi-node failures in a disaster recovery cluster is provided. In the event of an error condition, a switchover operation occurs from the failed nodes to one or more surviving nodes. Data stored in non-volatile random access memory is recovered by the surviving nodes to bring storage objects, e.g., disks, aggregates and/or volumes into a consistent state.