Abstract:
Systems and methods for reducing metadata in a write-anywhere storage system are disclosed herein. The system includes a plurality of clients coupled with a plurality of storage nodes, each storage node having a plurality of primary storage devices coupled thereto. A memory management unit including cache memory is included in the client. The memory management unit serves as a cache for data produced by the clients before the data is stored in the primary storage. The cache includes an extent cache, an extent index, a commit cache and a commit index. The movement of data and metadata is managed by an interval tree. Methods for reducing data in the interval tree improve the data storage and data retrieval performance of the system.
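The abstract does not say how data in the interval tree is reduced. One plausible reduction, shown here only as a sketch, is to coalesce overlapping or contiguous extents so that fewer entries need to be tracked. The `Extent` tuple and the `coalesce_extents` name are assumptions made for illustration, and a sorted list stands in for a real interval tree.

```python
from typing import List, Tuple

# Hypothetical representation of an extent: (start_offset, length).
Extent = Tuple[int, int]


def coalesce_extents(extents: List[Extent]) -> List[Extent]:
    """Merge overlapping or contiguous extents into the fewest entries."""
    merged: List[Extent] = []
    for start, length in sorted(extents):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # This extent touches or overlaps the previous one: extend it.
            prev_start, prev_len = merged[-1]
            new_end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, new_end - prev_start)
        else:
            merged.append((start, length))
    return merged
```

For example, `coalesce_extents([(0, 4), (4, 4), (16, 8), (10, 2)])` collapses the first two extents into `(0, 8)` and leaves the other two untouched.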
Abstract:
A resilient distributed replicated data storage system is described herein. The storage system includes zones that are independent and autonomous from each other. The zones include nodes that are independent and autonomous. The nodes include storage devices. When a data item is stored, it is partitioned into a plurality of data objects and a plurality of parity objects are calculated. Reassembly instructions are created for the data item. The data objects, parity objects and reassembly instructions are spread across nodes and zones in the storage system according to a policy for the data item. When a zone is inaccessible, a virtual zone is created and used until the intended zone is available. When a read request is received, the data item is prepared from the lowest latency nodes according to the reassembly instructions, and a virtual zone is accessed in place of a real zone when the real zone is inaccessible.
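The partition, parity and reassembly steps can be illustrated with a deliberately simplified sketch. It assumes a single XOR parity object, whereas the abstract describes a plurality of parity objects and does not name the coding scheme. The `partition` and `reassemble` functions and the layout of the reassembly instructions are hypothetical.

```python
from functools import reduce
from typing import Dict, List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def partition(item: bytes, k: int) -> Dict:
    """Split an item into k padded data objects plus one XOR parity object."""
    size = -(-len(item) // k)  # ceiling division
    chunks = [item[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    # Assumed form of the reassembly instructions: enough to undo the padding.
    instructions = {"k": k, "chunk_size": size, "length": len(item)}
    return {"data": chunks, "parity": parity, "instructions": instructions}


def reassemble(data: List[Optional[bytes]], parity: bytes, instructions: Dict) -> bytes:
    """Rebuild the item, recovering at most one missing data object."""
    missing = [i for i, chunk in enumerate(data) if chunk is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can recover only one lost object")
    data = list(data)
    if missing:
        present = [chunk for chunk in data if chunk is not None]
        data[missing[0]] = reduce(xor_bytes, present, parity)
    return b"".join(data)[:instructions["length"]]
```

With one parity object the item survives the loss of any single data object. Tolerating the loss of a whole zone, as the abstract describes, would need more parity objects and a stronger erasure code.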
Abstract:
A failure resilient distributed replicated data storage system is described herein. The storage system includes zones that are independent and autonomous from each other. The zones include nodes that are independent and autonomous. The nodes include storage devices. When a data item is stored, it is partitioned into a plurality of data objects and a plurality of parity objects are calculated. Reassembly instructions are created for the data item. The data objects and parity objects are spread across all nodes and zones in the storage system. Reassembly instructions are also spread across the zones. When a read request is received, the data item is prepared from the lowest latency nodes according to the reassembly instructions. This provides for data resiliency while keeping the amount of storage space required relatively low.
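Reading from the lowest latency nodes can be sketched as a simple selection step. The latency table, the node names and the `pick_lowest_latency` function are assumptions for illustration. The abstract does not say how latency is measured or how many nodes a read needs.

```python
from typing import Dict, List


def pick_lowest_latency(latency_ms: Dict[str, float], needed: int) -> List[str]:
    """Return the `needed` node names with the smallest measured latency."""
    if needed > len(latency_ms):
        raise ValueError("not enough reachable nodes to satisfy the read")
    return sorted(latency_ms, key=latency_ms.get)[:needed]
```

A read that needs two objects and sees latencies of `{"a": 30, "b": 5, "c": 12, "d": 50}` would fetch from nodes `b` and `c`.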
Abstract:
A searchable data storage system is described herein. The storage system includes zones that are independent and autonomous from each other. The zones include nodes that are independent and autonomous. The nodes include storage devices. When a data item is stored, a local database is updated with information about the newly stored data item. When a search for a data item meeting certain metadata criteria is received, multiple concurrent searches are conducted across all storage devices in all nodes in all zones of the storage system. The configuration of the data storage system allows concurrent searches at the constituent storage devices to be performed quickly.
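The concurrent search can be sketched with a thread pool that queries each device's local database and merges the hits. Modelling a local database as an in-memory dictionary, and the names `search_device` and `parallel_search`, are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Metadata = Dict[str, str]
# Hypothetical local database: object id -> metadata, one per storage device.
LocalDb = Dict[str, Metadata]


def search_device(db: LocalDb, predicate: Callable[[Metadata], bool]) -> List[str]:
    """Search a single device's local database."""
    return [oid for oid, meta in db.items() if predicate(meta)]


def parallel_search(devices: List[LocalDb],
                    predicate: Callable[[Metadata], bool]) -> List[str]:
    """Run one search per device concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(devices))) as pool:
        results = pool.map(lambda db: search_device(db, predicate), devices)
        return sorted(oid for hits in results for oid in hits)
```

Each device searches only its own database, so the total search time is roughly bounded by the slowest single device rather than by the sum over all devices.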
Abstract:
A data storage system allowing for ingest of data when certain storage is unavailable is described herein. The storage system includes zones that are independent and autonomous from each other. The zones include nodes that are independent and autonomous. The nodes include storage devices. When data is to be stored in the data storage system according to a specified storage policy and the specified storage policy cannot be achieved, the data is stored according to a fallback storage policy. This allows a client to continue executing without having to wait for a storage anomaly to be corrected or to pass. After the data is stored according to the fallback storage policy, the data is later stored according to the specified storage policy.
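The fallback behavior can be sketched as a store that tries the specified policy first, falls back when it cannot be met, and later re-applies the specified policy. Modelling a policy as the tuple of zones that must each hold the data, and the `PolicyStore` class itself, are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# Hypothetical policy form: the zones that must each hold the data.
Policy = Tuple[str, ...]


@dataclass
class PolicyStore:
    available: Set[str]
    placed: Dict[str, Policy] = field(default_factory=dict)
    deferred: Dict[str, Policy] = field(default_factory=dict)

    def put(self, key: str, specified: Policy, fallback: Policy) -> str:
        """Store under the specified policy, or the fallback if it cannot be met."""
        if set(specified) <= self.available:
            self.placed[key] = specified
            return "specified"
        if set(fallback) <= self.available:
            self.placed[key] = fallback
            self.deferred[key] = specified  # remember what was originally asked for
            return "fallback"
        raise RuntimeError("neither policy can be satisfied")

    def reconcile(self) -> None:
        """Re-store deferred items once their specified policy is achievable."""
        for key, specified in list(self.deferred.items()):
            if set(specified) <= self.available:
                self.placed[key] = specified
                del self.deferred[key]
```

The client's `put` returns immediately in either case. Moving the data to its specified placement happens later in `reconcile`, once the missing zone is available again.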
Abstract:
Data storage systems and methods for storing data are described herein. The storage system includes a first storage node configured to issue a first delivery request to a first set of other storage nodes in the storage system, the first delivery request including a first at least one data operation for each of the first set of other storage nodes, and to issue at least one other delivery request while the first delivery request remains outstanding, the at least one other delivery request including a first commit request for each of the first set of other storage nodes. The first storage node causes the first at least one data operation to be made active within the storage system in response to receipt of a commit indicator along with a delivery acknowledgement regarding one of the at least one other delivery request.
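The pipelined commit can be sketched as a small synchronous simulation in which a later delivery request carries the commit request for an earlier one. The `Peer` and `FirstNode` classes, the message fields and the rule that every peer must return the commit indicator are assumptions for illustration. The abstract does not define the wire format or the quorum rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Peer:
    staged: Dict[int, str] = field(default_factory=dict)

    def deliver(self, req_id: int, op: Optional[str],
                commit_of: Optional[int]) -> Dict[str, Optional[int]]:
        """Stage the operation and acknowledge, echoing any commit it can honor."""
        if op is not None:
            self.staged[req_id] = op
        committed = commit_of is not None and commit_of in self.staged
        return {"ack": req_id, "commit_indicator": commit_of if committed else None}


@dataclass
class FirstNode:
    peers: List[Peer]
    active: List[str] = field(default_factory=list)
    outstanding: Dict[int, str] = field(default_factory=dict)

    def send(self, req_id: int, op: Optional[str],
             commit_of: Optional[int] = None) -> None:
        """Issue a delivery request, optionally carrying a commit for an earlier one."""
        if op is not None:
            self.outstanding[req_id] = op
        acks = [peer.deliver(req_id, op, commit_of) for peer in self.peers]
        all_committed = all(a["commit_indicator"] == commit_of for a in acks)
        if commit_of in self.outstanding and all_committed:
            # The commit indicator arrived with this request's acknowledgement,
            # so the earlier operation becomes active.
            self.active.append(self.outstanding.pop(commit_of))
```

In this sketch, `node.send(1, "write A")` leaves the write outstanding. A following `node.send(2, "write B", commit_of=1)` makes "write A" active and leaves "write B" outstanding until a later request commits it.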