Abstract:
A semi-static distribution technique distributes parity across disks of an array. According to the technique, parity is distributed (assigned) across the disks of the array in a manner that maintains a fixed pattern of parity blocks among the stripes of the disks. When one or more disks are added to the array, the semi-static technique redistributes parity in a way that does not require recalculation of parity or moving of any data blocks. Notably, the parity information is not actually moved; the technique merely involves a change in the assignment (or reservation) for some of the parity blocks of each pre-existing disk to the newly added disk.
Abstract:
A uniform and symmetric, double failure-correcting technique protects against two or fewer disk failures in a disk array of a storage system. A RAID system of the storage system generates two disks worth of “redundant” information for storage in the array, wherein the redundant information (e.g., parity) is illustratively derived from computations along both diagonal parity sets (“diagonals”) and row parity sets (“rows”). Specifically, the RAID system computes row parity along rows of the array and diagonal parity along diagonals of the array. However, the contents of the redundant (parity) information disks interact such that neither disk contains purely (solely) diagonal or row redundancy information; the redundant information is generated using diagonal parity results in row parity computations (and vice versa).
Abstract:
The invention provides a method and system for recovery of file system data in file servers having mirrored file system volumes. The invention makes use of a 'snapshot' feature of a robust file system (the 'WAFL File System) to rapidly determine which of two or more mirrored volumes is most up-to-date, and which file blocks of the most recent mirrored volume have been changed from each one of the mirrored file systems. In a preferred embodiment, among a plurality of mirrored volumes, the invention rapidly determines which is the most up-to-date by examining a consistency point number maintained by the WAFL File System at each mirrored volume. The invention rapidly pairwise determines what blocks are shared between that most up-to-date mirrored volume and each other mirrored volume, in response to a snapshot of the file system maintained at each mirrored volume and are stored in common pairwise between each mirrored volume and the most up-to-date mirrored volume. The invention re synchronizes only those blocks that have been changed between the common snapshot and the most up-to-date snapshot.
Abstract:
On disk failure (210), the storage system migrates only those disk blocks that included allocated data, and treats unallocated disk blocks as being logically zero when possible. When there is no spare disk, the source disk block is logically set to zero and parity is recalculated for the RAID stripe associated with the source disk block (223). When there is a spare, unallocated blocks on the spare are logically or physically set to zero upon migration (222). Write operations for the failed disk are redirected to other non-failing disks, and a record of which in-use disk blocks have been thus 'migrated' to those other non-failing disks in maintained. Unused disk blocks are proactively set to zero. A target mirror copy is created using information regarding allocated disk blocks, by copying those blocks including allocated data or parity, and by clearing at the mirror those blocks not including any allocated.
Abstract:
The invention provides a method and system for performing XOR operations without consuming substantial computing resources. A specialised processor is coupled to the same bus as a set of disk drives; the specialized processor reviews data transfers to and from the disk drives and performs XOR operations on data transferred to and from the disk drives without requiring separate transfers. The specialised processor maintains an XOR accumulator which is used for XOR operations, which records the result of XOR operations, and which is read out upon command of the processor. The XOR accumulator includes one set of accumulator registers for each RAID stripe, for a selected set of RAID stripes. A memory (such as a contents-addressable memory) associates one set of accumulator registers with each selected RAID stripe.
Abstract:
An improved system and method enhances performance of updates to sequential block storage of a storage system. A disk-based sort procedure is provided to establish locality among updates (write data) held in a disk-based log, thereby enabling the write data to be efficiently written to home locations on a home location array. As the write data is received, a log manager of the storage system temporarily stores the data efficiently on the disk-based log. As more write data arrives, the log manager sorts the data in the log in accordance with the sort procedure, thus increasing the locality of data when stored on the home location array. When the log approaches capacity, the log manager writes the sorted data to their home locations on the array with high locality and performance.
Abstract:
A method and apparatus for a reliable data storage system using block level checksums appended to data blocks. Files are stored on hard disks in storage blocks, including data blocks and block-appended checksums. The block-appended checksum includes a checksum of the data block, a VBN, a DBN, and an embedded checksum for checking the integrity of the block-appended checksum itself. A file system includes file blocks with associated block-appended checksum to the data blocks. The file blocks with block-appended checksums are written to storage blocks. In a preferred embodiment a collection of disk drives are formatted with 520 bytes of data per sector. For each 4,096-byte file block, a corresponding 64-byte block-appended checksum is appended to the file block with the first 7 sectors including most of the file block data while the 8th sector includes the remaining file block data and the 64-byte block-appended checksum.
Abstract:
The invention provides a method and system for performing XOR operations without consuming substantial computing resources. A specialised processor is coupled to the same bus as a set of disk drives; the specialized processor reviews data transfers to and from the disk drives and performs XOR operations on data transferred to and from the disk drives without requiring separate transfers. The specialised processor maintains an XOR accumulator which is used for XOR operations, which records the result of XOR operations, and which is read out upon command of the processor. The XOR accumulator includes one set of accumulator registers for each RAID stripe, for a selected set of RAID stripes. A memory (such as a contents-addressable memory) associates one set of accumulator registers with each selected RAID stripe.