Abstract:
A method for adjustable error correction in a storage cluster is provided. The method includes determining health of a non-volatile memory of a non-volatile solid-state storage unit of each of a plurality of storage nodes in a storage cluster on a basis of per flash package, per flash die, per flash plane, per flash block, or per flash page. The determining is performed by the storage cluster. The plurality of storage nodes is housed within a chassis that couples the storage nodes as the storage cluster. The method includes adjusting erasure coding across the plurality of storage nodes based on the health of the non-volatile memory and distributing user data throughout the plurality of storage nodes through the erasure coding. The user data is accessible via the erasure coding from a remainder of the plurality of storage nodes if any of the plurality of storage nodes are unreachable.
Abstract:
A method of failure mapping is provided. The method includes distributing user data throughout a plurality of storage nodes through erasure coding, wherein the plurality of storage nodes are housed within a chassis that couples the storage nodes as a storage cluster. Each of the plurality of storage nodes has a non-volatile solid-state storage with flash memory or other types of non-volatile memory and the user data is accessible via the erasure coding from a remainder of the plurality of storage nodes in event of two of the plurality of storage nodes being unreachable. The method includes determining that a non-volatile memory block in the memory has a defect and generating a mask that indicates the non-volatile memory block and the defect. The method includes reading from the non-volatile memory block with application of the mask, wherein the reading and the application of the mask are performed by the non-volatile solid-state storage.
Abstract:
A storage system with storage drives and a processing device establishes resiliency groups of storage system resources. The storage system determines an explicit trade-off between data survivability over resource failures and data capacity efficiency, for the resiliency groups. Responsive to adding at least one storage drive, the storage system establishes re-formed resiliency groups according to the explicit trade-off, without decreasing data survivability. The storage system may bias to have more and narrower resiliency groups to increase mean time to data loss.
Abstract:
A method for adjustable error correction in a storage cluster is provided. The method includes determining health of a non-volatile memory of a non-volatile solid-state storage unit of each of a plurality of storage nodes in a storage cluster on a basis of per flash package, per flash die, per flash plane, per flash block, or per flash page. The determining is performed by the storage cluster. The plurality of storage nodes is housed within a chassis that couples the storage nodes as the storage cluster. The method includes adjusting erasure coding across the plurality of storage nodes based on the health of the non-volatile memory and distributing user data throughout the plurality of storage nodes through the erasure coding. The user data is accessible via the erasure coding from a remainder of the plurality of storage nodes if any of the plurality of storage nodes are unreachable.
Abstract:
A method for page writes for triple or higher level cell flash memory is provided. The method includes receiving data in a storage system, from a client that is agnostic of page write requirements for triple or higher level cell flash memory, wherein the page write requirements specify an amount of data and a sequence of writing data for a set of pages to assure read data coherency for the set of pages. The method includes accumulating the received data, in random-access memory (RAM) in the storage system to satisfy the page write requirements for the triple or higher level cell flash memory in the storage system. The method includes writing at least a portion of the accumulated data in accordance with the page write requirements, from the RAM to the triple level cell, or the higher level cell, flash memory in the storage system as an atomic write.
Abstract:
A storage system that supports independent scaling of compute resources and storage resources, the storage system including: one or more chassis, wherein each chassis includes a plurality of slots, each slot configured to receive a blade; a plurality of compute resources; a plurality of storage resources; a plurality of blades, where each blade includes at least one compute resource or at least one storage resource and each of the storage resources may be directly accessed by each of the compute resources without utilizing an intermediate compute resource; a first power domain configured to deliver power to one or more of the compute resources; and a second power domain configured to deliver power to the storage resources, wherein the first power domain and the second power domain can be independently operated.
Abstract:
A first set of physical units of a storage device of a storage system is selected for performance of low latency access operations, wherein other access operations are performed by remaining physical units of the storage device. A determination as to whether a triggering event has occurred that causes a selection of a new set of physical units of the storage device for the performance of low latency access operations is made. A second set of physical units of the storage device is selected for the performance of low latency access operations upon determining that the triggering event has occurred.
Abstract:
A storage system has a resiliency scheme to enhance storage system performance. The storage system composes a RAID stripe. The storage system mixes an ordering of portions of the RAID stripe, based on reliability differences across zones of the solid-state memory. Each zone of the solid state memory corresponds to a type of solid state memory. The storage system writes the mixed ordering RAID stripe across the solid-state memory.
Abstract:
A storage system has a first memory, a second memory that include solid-state storage memory, and a processing device. The processing device is to select a mode for each portion of data to be written. Selection of the mode is based at least on size of the portion of data. Selection of the mode is from among modes that include a first mode of writing the portion of data in mirrored RAID form to the first memory for later transfer from the first memory to the second memory, a second mode of writing the portion of data in parity-based RAID form to the first memory for later transfer from the first memory to the second memory, and a third mode of writing the portion of data to the second memory, bypassing the first memory. The processing device is to handle portions of data to be written according to such selection.
Abstract:
A storage system determines a size of a portion of data to be written as a RAID stripe across storage devices. The storage system determines aspects of the RAID stripe. Aspects of the RAID stripe include a data segment size for shards of the RAID stripe, a type of RAID, a width of the RAID stripe, a level of redundancy of the RAID stripe, and a selection of members of the storage devices. All of the determining for the aspects of the RAID stripe are on a dynamic basis and based at least on the size of the portion of data. The storage system writes the portion of data according to the determined aspects of the RAID stripe across the selected members of the storage devices.