Abstract:
A data migration system supports a low-latency, reduced-overhead data storage protocol for collision-free sharing of data storage, which does not require inter-communication or permanent arbitration between data storage controllers to decide on data placement and routing. Multiple data fragments of a data set are prevented from being routed to the same storage devices by a multi-step selection protocol which selects, in a first phase of the selection routine, the highest-ranked healthy drive enclosure and, in a second phase, the highest-ranked healthy data storage controller residing in the selected drive enclosure. Data fragments are thereby routed to different storage pools assigned to the selected data storage devices for exclusive “writing” and data modification. The selection protocol also handles various failure scenarios while keeping data placement collision-free.
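The two-phase selection can be illustrated with a minimal sketch. The `Enclosure` and `Controller` types, the convention that a lower `rank` number means a higher rank, and the fallback to the next enclosure when the chosen one has no healthy controller are all assumptions made for illustration; the abstract does not specify how ranks are computed or how each failure case is resolved.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Controller:
    name: str
    rank: int            # assumption: lower number = higher rank
    healthy: bool = True


@dataclass
class Enclosure:
    name: str
    rank: int            # assumption: lower number = higher rank
    healthy: bool = True
    controllers: List[Controller] = field(default_factory=list)


def select_controller(enclosures: List[Enclosure]) -> Optional[Controller]:
    """Pick a target controller for one data fragment.

    Phase 1: walk healthy enclosures from highest to lowest rank.
    Phase 2: inside the enclosure, take the highest-ranked healthy controller.
    Falling through to the next enclosure is an assumed failure policy.
    """
    healthy_enclosures = sorted(
        (e for e in enclosures if e.healthy), key=lambda e: e.rank
    )
    for enclosure in healthy_enclosures:
        candidates = [c for c in enclosure.controllers if c.healthy]
        if candidates:
            return min(candidates, key=lambda c: c.rank)
    return None  # no healthy enclosure/controller pair is available
```

Because every node can evaluate the same ranking locally, this kind of selection needs no messages between controllers, which is consistent with the arbitration-free behaviour the abstract describes.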
Abstract:
Systems and methods are provided for expanding the available memory of a storage controller. The systems and methods utilize a PCIe memory controller connected to the backend interface of the storage controller. Memory of the PCIe memory controller is memory mapped to controller memory of the storage controller. The PCIe connection allows the storage controller to access the memory of the PCIe memory controller with latencies similar to that of the controller memory.
Abstract:
A Variable Redundancy Distributed (VRD) RAID controller in a data storage environment contains embedded RAID logic that permits selecting and computing a desired redundancy coding scheme from a plurality of schemes pre-programmed and embedded in a Compute Engine of the VRD RAID controller. “Write” and “Read” requests received from data generating entities contain information identifying the type of redundancy coding scheme of interest. The controller decodes the request and automatically applies the desired computation to the incoming data without burdening the CPU with the computational activity. The variable redundancy computational ability of the subject systems provides a versatile and flexible tool for RAID operations.
Abstract:
A method and system for data migration between data generating entities and data storage devices protected by a de-clustered RAID algorithm are enhanced by dynamically controlling the I/O activity towards the data storage devices (NVM devices) based on their remaining lifespan (health), with the goal of preventing multiple devices selected for writing parity stripe information from failing simultaneously. This is achieved by polling the remaining health of the NVM devices in the RAID pool, computing a weighted lifespan for each NVM device, comparing it to the average for all NVM devices in the pool, and adjusting the I/O activity towards that NVM device accordingly. If the weighted lifespan exceeds the pool average, the allowed I/O activity is increased; if it is below the pool average, the device is sent fewer “writes”.
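The compare-and-adjust step can be sketched as follows. The function name, the multiplicative `step` factor, and the representation of I/O activity as a per-device weight are hypothetical; the abstract does not define how the weighted lifespan is computed, so the sketch takes it as an input.

```python
from typing import Dict


def adjust_io_weights(
    weighted_life: Dict[str, float],
    io_weight: Dict[str, float],
    step: float = 0.1,  # assumed adjustment factor, not from the source
) -> Dict[str, float]:
    """Return new per-device I/O weights based on lifespan vs. pool average."""
    average = sum(weighted_life.values()) / len(weighted_life)
    adjusted = {}
    for device, life in weighted_life.items():
        weight = io_weight[device]
        if life > average:
            weight *= 1 + step   # healthier than average: allow more I/O
        elif life < average:
            weight *= 1 - step   # more worn than average: send fewer writes
        adjusted[device] = weight
    return adjusted
```

Steering writes away from the most worn devices spreads out their end-of-life dates, which is how the scheme aims to avoid several members of one parity stripe failing at once.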
Abstract:
Systems and methods for automated firmware update with rollback are described herein. The systems include a plurality of storage zones, each storage zone including a plurality of storage nodes, each storage node including a plurality of storage media. The method includes monitoring storage system activity and parameters and maintaining a data storage system usage and parameter database containing system activity information. When a firmware update is available, data storage system activity is evaluated. Storage nodes needing the firmware update are identified. The firmware update is run on available storage nodes identified as needing the firmware update. The impact of the firmware update is evaluated and a rollback of the firmware update is initiated on all firmware updated storage nodes when parameter variations are significant and/or result in degraded performance.
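A minimal sketch of the update-evaluate-rollback loop is given below. The `Node` type, the `measure` callback standing in for the usage and parameter database, and the 10% `tolerance` threshold for a "significant" variation are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    name: str
    firmware: str
    busy: bool = False   # assumed stand-in for "not currently available"


def update_with_rollback(
    nodes: List[Node],
    new_version: str,
    measure: Callable[[Node], float],  # hypothetical metric, higher is better
    tolerance: float = 0.1,            # assumed significance threshold
) -> bool:
    """Update available nodes; roll all of them back if performance degrades.

    Returns True if the update was kept, False if it was rolled back.
    """
    targets = [n for n in nodes if n.firmware != new_version and not n.busy]
    baseline = {n.name: measure(n) for n in targets}
    previous = {n.name: n.firmware for n in targets}

    for node in targets:
        node.firmware = new_version

    degraded = any(
        measure(n) < baseline[n.name] * (1 - tolerance) for n in targets
    )
    if degraded:
        for node in targets:
            node.firmware = previous[node.name]
    return not degraded
```

Rolling back every updated node, rather than only the degraded ones, follows the abstract's statement that the rollback is initiated on all firmware updated storage nodes.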
Abstract:
A failure-resilient distributed replicated data storage system is described herein. The storage system includes zones that are independent and autonomous from each other. The zones include nodes that are independent and autonomous. The nodes include storage devices. When a data item is stored, it is partitioned into a plurality of data objects, and a plurality of parity objects is calculated. Reassembly instructions are created for the data item. The data objects and parity objects are spread across all nodes and zones in the storage system. Reassembly instructions are also spread across the zones. When a read request is received, the data item is prepared from the lowest latency nodes according to the reassembly instructions. This provides data resiliency while keeping the amount of storage space required relatively low.
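The partition-plus-parity idea can be illustrated with a deliberately simplified sketch. It uses a single XOR parity object, which tolerates the loss of one data object; the abstract calls for a plurality of parity objects and does not name the code used, so this is a stand-in technique rather than the described system's scheme. The function names are hypothetical.

```python
from typing import List, Optional, Tuple


def split_with_parity(data: bytes, k: int) -> Tuple[List[bytes], bytes]:
    """Split data into k equal-size objects (zero padded) plus one XOR parity."""
    size = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytearray(size)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return chunks, bytes(parity)


def recover_chunk(chunks: List[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing object (marked None) from the survivors."""
    missing = bytearray(parity)
    for chunk in chunks:
        if chunk is not None:
            for i, byte in enumerate(chunk):
                missing[i] ^= byte
    return bytes(missing)
```

In this sketch the original length and object order would have to be recorded separately, which is the role the reassembly instructions play in the abstract.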
Abstract:
Systems and methods for reducing metadata in a write-anywhere storage system are disclosed herein. The system includes a plurality of clients coupled with a plurality of storage nodes, each storage node having a plurality of primary storage devices coupled thereto. A memory management unit including cache memory is included in the client. The memory management unit serves as a cache for data produced by the clients before the data is stored in the primary storage. The cache includes an extent cache, an extent index, a commit cache and a commit index. The movement of data and metadata is handled through an interval tree. Methods for reducing data in the interval tree increase the data storage and data retrieval performance of the system.
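The abstract does not say how entries in the interval tree are reduced. One common way to shrink extent metadata is to merge extents that touch or overlap, sketched below under that assumption. A sorted list stands in for the interval tree, and the `(start, length)` tuple layout is hypothetical.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start offset, length); assumed layout


def coalesce_extents(extents: List[Extent]) -> List[Extent]:
    """Merge touching or overlapping extents into the fewest entries."""
    merged: List[Extent] = []
    for start, length in sorted(extents):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_length = merged[-1]
            end = max(prev_start + prev_length, start + length)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, length))
    return merged
```

Fewer entries mean fewer nodes to walk on lookup and less metadata to commit, which matches the performance motivation stated in the abstract.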
Abstract:
A searchable data storage system is described herein. The storage system includes zones that are independent and autonomous from each other. The zones include nodes that are independent and autonomous. The nodes include storage devices. When a data item is stored, a local database is updated with information about the newly stored data item. When a search for a data item meeting certain metadata criteria is received, multiple concurrent searches are conducted across all storage devices in all nodes in all zones of the storage system. The configuration of the data storage system allows a parallel concurrent search at the constituent storage devices to be performed quickly.
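The fan-out search can be sketched with a thread pool, assuming each storage device's local database is modelled as a dictionary from item ID to metadata. The function names and the exact-match criteria are hypothetical simplifications.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

DeviceIndex = Dict[str, Dict[str, str]]  # item id -> metadata (assumed shape)


def search_device(index: DeviceIndex, criteria: Dict[str, str]) -> List[str]:
    """Return IDs on one device whose metadata matches every criterion."""
    return [
        item_id
        for item_id, meta in index.items()
        if all(meta.get(key) == value for key, value in criteria.items())
    ]


def parallel_search(
    device_indexes: List[DeviceIndex], criteria: Dict[str, str]
) -> List[str]:
    """Query every device's local database concurrently and merge the hits."""
    workers = max(1, len(device_indexes))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda index: search_device(index, criteria), device_indexes
        )
        return sorted(hit for hits in results for hit in hits)
```

Each device only scans its own local database, so total search time is bounded by the slowest device rather than by the sum over all devices.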
Abstract:
A data migration system and method are provided in which a Burst Buffer Network Aggregator (BBNA) process, configured either on the File Servers or on the File System's dedicated I/O nodes, coalesces data fragments stored in participating Burst Buffer (BB) nodes, under the direction of a primary BB node appointed by a data generating entity, before the full data stripe is transferred into the File System. A “write” request in the form of a full data stripe is distributed as a plurality of data fragments among the participating BB nodes, along with corresponding metadata. The primary BB node gathers the metadata from the participating BB nodes and sends the metadata list to the BBNA unit. In response, the BBNA unit allocates a buffer sufficient to store the full data stripe and transfers the data fragments from the participating BB nodes into that buffer, thereby coalescing them into the full data stripe, which is subsequently transferred from the buffer in the BBNA unit into the File System.
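The coalescing step on the BBNA side can be sketched as follows. The `FragmentMeta` fields and the `fetch` callback, which stands in for the transfer from a BB node, are assumptions; the abstract does not describe the metadata layout.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FragmentMeta:
    node: str     # BB node holding the fragment (hypothetical field)
    offset: int   # byte offset of the fragment within the full stripe
    length: int   # fragment length in bytes


def coalesce_stripe(
    metadata: List[FragmentMeta],
    fetch: Callable[[FragmentMeta], bytes],
) -> bytes:
    """Allocate a full-stripe buffer and copy each fragment into its slot."""
    stripe_size = max(m.offset + m.length for m in metadata)
    buffer = bytearray(stripe_size)
    for meta in metadata:
        data = fetch(meta)
        if len(data) != meta.length:
            raise ValueError(f"fragment from {meta.node} has wrong length")
        buffer[meta.offset:meta.offset + meta.length] = data
    return bytes(buffer)
```

Because each fragment carries its own offset, the fragments can arrive in any order and the File System still receives one contiguous full-stripe write.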