Abstract:
The storage of data sets in a storage set (e.g., data sets written to hard disk drives comprising a RAID array) may diminish the performance of the storage set through non-sequential writes, particularly if storage devices promptly write data sets that are soon followed by sequentially following data sets. Additionally, storage sets may exhibit inconsistencies due to non-atomic writes of data sets and verifiers (e.g., checksums) and an intervening failure, such as an occurrence of the RAID write hole. Instead, data sets and verifiers may first be written to a journal stored on the nonvolatile media of a storage device before being committed to the storage set. Such writes may be made sequentially to the journal, irrespective of the locations of the data sets in the storage set; and recovery from a failure may simply involve re-committing the consistent records in the journal to correct incomplete writes to the storage set.
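The journaling idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation the abstract describes: the `JournaledStore` class, the in-memory `storage` dictionary standing in for the storage set, the `fail_after` crash simulation, and the choice of CRC32 as the verifier are all hypothetical.

```python
import zlib


class JournaledStore:
    """Toy storage set fronted by a journal (all names are hypothetical)."""

    def __init__(self):
        self.journal = []   # sequential log of (location, data, verifier) records
        self.storage = {}   # location -> (data, verifier); stands in for the storage set

    def write(self, location, data):
        # Append the data set and its verifier to the journal in arrival order,
        # irrespective of where the data set will finally live.
        self.journal.append((location, data, zlib.crc32(data)))

    def commit(self, fail_after=None):
        # Copy journal records into the storage set. `fail_after` simulates a
        # failure after that many records; the journal is left intact.
        for i, (location, data, verifier) in enumerate(self.journal):
            if fail_after is not None and i >= fail_after:
                return False
            self.storage[location] = (data, verifier)
        self.journal.clear()
        return True

    def recover(self):
        # Keep only consistent records (verifier matches data), then re-commit
        # them so that any incomplete writes to the storage set are corrected.
        self.journal = [r for r in self.journal if zlib.crc32(r[1]) == r[2]]
        return self.commit()
```

Because a record leaves the journal only after a commit completes, calling `recover()` after an interrupted `commit()` simply replays the same records.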
Abstract:
A system in which a file system may operate on a volume in which the logical address extent of the volume is divided into multiple tiers, each tier providing storage having a distinct trait set by mapping the logical addresses of the volume to appropriate underlying storage systems. A volume system exposes the volume to the file system in such a manner that the file system itself is aware of the tiers and of the trait set of each tier. The file system may thus store file system namespaces (such as directories and files) in the tiers as appropriate for each file system namespace. A provisioning system may also be provided and configured to provision the volume to include such tiers and, if desired, to extend the tiers.
Abstract:
The provisioning of a volume that has multiple tiers corresponding to different trait sets. The volume to be provisioned is identified along with multiple tiers that are to be in the volume. For each of the tiers that are to be provisioned within the volume, a corresponding trait set is identified to be applied to that tier. This corresponding trait set may be based on underlying storage systems that are available at the time of provisioning, or which are anticipated to be available. The volume is then caused to be provisioned with the corresponding tiers having the corresponding trait sets. Also described is the provisioning of a file that is determined to have one or more storage traits. Based on these storage traits, the file is then caused to be assigned to an appropriate tier.
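As a rough illustration of assigning a file to a tier based on its storage traits, the sketch below treats each tier as a named set of traits. The function name `assign_tier`, the trait strings, and the matching rule (pick the first tier whose trait set covers all of the file's traits) are assumptions made for this example; the abstract does not specify how an "appropriate" tier is chosen.

```python
def assign_tier(file_traits, tiers):
    """Return the name of the first tier whose trait set covers file_traits.

    tiers: dict mapping tier name -> set of trait names (hypothetical model).
    Returns None when no tier offers every requested trait.
    """
    for name, traits in tiers.items():
        if file_traits <= traits:  # subset test: the tier offers every requested trait
            return name
    return None
```

For example, with `tiers = {"capacity": {"sequential"}, "performance": {"random-access", "low-latency"}}`, a file with the trait `{"low-latency"}` would be placed in the `"performance"` tier.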
Abstract:
Resiliency techniques for a virtual disk are described that enable user control over storage efficiency and recovery time. Configuration parameters for a virtual disk are obtained that indicate a number of available storage devices and a specified tolerance for storage device failures. A default configuration for the virtual disk that designates a default amount of redundancy data to store with client data to balance storage efficiency and recovery time is derived based on the configuration parameters. Options may then be provided to specify a custom configuration that changes the amount of redundancy data to customize the level of storage efficiency and recovery time. The virtual disk is configured and data is stored thereon in accordance with the default configuration or the custom configuration as directed by the user.
Abstract:
A set of storage devices may interoperate to share a pool of storage space, such as in a Redundant Array of Inexpensive Disks (RAID) scheme. However, the details of the representation of the pool and the allocation of capacity to the pool may enable advantages and/or impose limitations on the storage set. Presented herein are techniques for generating a representation of a pooled partition on one or more storage devices, featuring a pool configuration that represents the pool as a set of spaces manifested by the pool; a set of storage devices sharing the pool; and a set of extents that map physical areas of the storage devices to logical areas of the spaces. The flexibility of these pooling techniques may enable such features as flexible capacity allocation, delayed binding, thin provisioning, and the participation of a storage device in two or more distinct pools shared with different sets of storage devices.
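The three parts of the pool configuration (spaces, storage devices, and extents) can be modeled as below. This is a minimal sketch, not the on-disk format the abstract refers to: the `Extent` and `PoolConfiguration` classes, their field names, and the `bind` and `resolve` methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Extent:
    space: str            # space that the logical area belongs to
    logical_offset: int   # start of the logical area within the space
    device: str           # storage device holding the physical area
    physical_offset: int  # start of the physical area on that device
    length: int           # size of the mapped area


@dataclass
class PoolConfiguration:
    devices: set = field(default_factory=set)
    spaces: set = field(default_factory=set)
    extents: list = field(default_factory=list)

    def bind(self, extent):
        # An extent can be recorded at any time after the space is created,
        # which is one way to model delayed binding.
        if extent.device not in self.devices or extent.space not in self.spaces:
            raise ValueError("unknown device or space")
        self.extents.append(extent)

    def resolve(self, space, logical_address):
        # Translate a logical address in a space to (device, physical address).
        for e in self.extents:
            if e.space == space and e.logical_offset <= logical_address < e.logical_offset + e.length:
                return e.device, e.physical_offset + (logical_address - e.logical_offset)
        return None  # no extent bound yet, as in a thinly provisioned space
```

Since a device is referenced only by name inside an extent, nothing in this model prevents the same device from appearing in a second `PoolConfiguration` alongside a different set of devices.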
Abstract:
Aspects of the subject matter described herein relate to sharing volume data via shadow copies. In aspects, an active computer creates a shadow copy of a volume. The shadow copy is exposed to one or more passive computers that may read but not write to the volume. A passive computer may obtain data from the shadow copy by determining whether the data has been written to a differential area and, if so, reading it from the differential area. If the data has not been written to the differential area, the passive computer may obtain it by first reading it from the volume, then re-determining whether it has been written to the differential area, and if so, reading the data from the differential area. Otherwise, the data read from the volume corresponds to the data needed for the shadow copy.
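The read procedure followed by a passive computer can be sketched as a check, a read, and a re-check. In this illustration the `volume` and `diff_area` arguments are plain mappings from block number to data; those names and that representation are assumptions made for the example.

```python
def read_shadow_copy(block, volume, diff_area):
    """Read one block of the shadow copy as a passive (read-only) computer."""
    # First check: the shadow-copy data may already be in the differential area.
    if block in diff_area:
        return diff_area[block]
    # Not there, so read the block from the volume itself.
    data = volume[block]
    # Re-check: the block may have been written to the differential area while
    # the volume was being read, in which case the volume data may be too new.
    if block in diff_area:
        return diff_area[block]
    # Otherwise the data read from the volume is the shadow-copy data.
    return data
```

The second check is what makes the read safe without locking the volume: if the differential area is still empty for that block after the volume read, the data just read corresponds to the shadow copy.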
Abstract:
Some implementations may include a virtual storage system to which data is written. The virtual storage system may include a cache and multiple hard drives. Multiple queues may be associated with the multiple hard drives such that each hard drive of the multiple hard drives has a corresponding queue of the multiple queues. A set of candidate rows may be selected from the cache. For each candidate row in the set of candidate rows, destination hard drives may be identified. Each candidate row may be placed in queues corresponding to the destination hard drives. Two or more candidate rows from the multiple queues may be written substantially contemporaneously (e.g., in parallel) to two or more destination hard drives.
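The queueing scheme above can be illustrated with one queue and one worker thread per drive. This is a sketch under stated assumptions: the function `flush_rows`, the `destinations_for` callback, and the use of dictionaries to stand in for hard drives are hypothetical, and a thread pool is only one possible way to write to several drives at roughly the same time.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def flush_rows(candidate_rows, destinations_for, drives):
    """Queue candidate rows per destination drive, then drain the queues in parallel.

    candidate_rows: list of (row_id, data) selected from the cache
    destinations_for: callable mapping row_id -> list of drive names
    drives: dict mapping drive name -> dict standing in for that drive
    """
    # One queue per hard drive.
    queues = {name: deque() for name in drives}
    for row_id, data in candidate_rows:
        for name in destinations_for(row_id):
            queues[name].append((row_id, data))

    def drain(name):
        # Each drive is written by exactly one worker, in queue order.
        while queues[name]:
            row_id, data = queues[name].popleft()
            drives[name][row_id] = data

    # Drain all queues concurrently; list() surfaces any worker exception.
    with ThreadPoolExecutor(max_workers=max(1, len(drives))) as pool:
        list(pool.map(drain, drives))
```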
Abstract:
Techniques for recovery and redistribution of data from a virtual disk storage system are described herein. In one or more implementations, a storage scheme derived for a virtual disk configuration is configured to implement various recovery and redistribution techniques designed to improve recovery performance. The storage scheme implements one or more allocation techniques to produce uniform or nearly uniform distributions of data across physical storage devices associated with a virtual disk. The allocation facilitates concurrent regeneration and rebalancing operations for recovery of data in the event of failures. Additionally, the storage scheme is configured to implement parallelization techniques to perform the concurrent operations, including but not limited to controlling multiple parallel reads and writes during recovery.
Abstract:
A thinly provisioned storage system detects whether physical storage capacity is available when there is a request to allocate storage capacity, prior to data being written to the storage system. In particular, at the time when the file system allocates storage, such as when creating a file or performing an extending write (append) operation, allocating storage to an unallocated region of a sparse file, defragmenting a file, and the like, a storage system can verify that actual physical storage capacity is available. Thus, if there is insufficient actual physical capacity at the time when a storage allocation is attempted, then an error message can be sent and remedial action can be taken.
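The allocation-time check can be sketched as follows. The `ThinPool` class, its block-based accounting, and the `InsufficientCapacityError` exception are assumptions made for this example; the abstract only states that availability is verified when storage is allocated and that an error can be reported if capacity is insufficient.

```python
class InsufficientCapacityError(Exception):
    """Raised when an allocation cannot be backed by physical capacity."""


class ThinPool:
    """Toy thinly provisioned pool that checks capacity at allocation time."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks  # actual physical capacity
        self.reserved = 0                       # blocks already promised to allocations

    def allocate(self, blocks):
        # Verify physical capacity when storage is allocated (file creation,
        # extending write, and so on), before any data is written.
        if self.reserved + blocks > self.physical_blocks:
            raise InsufficientCapacityError(
                f"need {blocks} blocks, only {self.physical_blocks - self.reserved} available"
            )
        self.reserved += blocks
```

A caller that catches `InsufficientCapacityError` learns about the shortage at allocation time and can take remedial action before attempting the write.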
Abstract:
The storage devices of a storage device set (e.g., a RAID array) may generate a nonvolatile representation of the configuration of the storage device set, including logical disks, spaces, storage pools, and layout and provisioning plans, on the physical media of the storage devices. A computer accessing the storage device set may also generate a volatile memory representation of the storage device set to use while accessing the storage devices; however, the nonvolatile representation may not be performant due to its different usage and characteristics. Presented herein are techniques for accessing the storage device set according to a volatile memory representation comprising a hierarchy of logical disks, slabs, and extents, and an accessor comprising a provisioning component that handles slab accesses while applying provisioning plans, and that interfaces with a lower-level layout component that translates slab accesses into storage device accesses while applying layout plans to the storage device set.