Data checksums without storage overhead
Abstract:
Disclosed herein is a computer-implemented method of including data characterising values of source data in redundant data. There are K source nodes of source data and R redundant nodes of redundant data, giving a plurality of N nodes, where N = K + R. Each of the N nodes comprises a plurality of sub-blocks of data, and a block of data comprises N sub-blocks, each comprised by a different one of the N nodes, such that each block comprises K sub-blocks of source data and R sub-blocks of redundant data. The method comprises: calculating K data characterising values in dependence on sub-blocks comprised by the source nodes, wherein each of the K data characterising values is associated with a different one of the K source nodes and with a different block, and each is calculated in dependence on all of the sub-blocks of its associated source node except the one sub-block of that source node that is also comprised by its associated block; and generating one or more sub-blocks of the source and redundant nodes in dependence on the K data characterising values. Advantages include one or more of: improved determination of whether or not the stored data comprises errors, an increase in the number of errors that can be detected, and improved recovery from errors. Because the data characteristics are included within the stored data rather than as metadata, they do not increase the amount of metadata required.
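The calculating step can be illustrated with a minimal sketch. It is not the disclosed implementation: the abstract does not name a checksum function or say which block is paired with which source node, so the use of CRC-32 and the "diagonal" pairing (source node i is associated with block i) are assumptions made only for illustration, and the function name `characterising_values` is hypothetical. The sketch covers only the calculation of the K values, not the subsequent generation of sub-blocks from them.

```python
import zlib


def characterising_values(source_nodes):
    """Return one checksum per source node.

    source_nodes[i][j] is sub-block j of source node i, as bytes.
    Block j is made up of sub-block j taken from every node.
    """
    k = len(source_nodes)
    values = []
    for i, node in enumerate(source_nodes):
        if len(node) < k:
            raise ValueError("each node needs at least K sub-blocks")
        # Assumption: source node i is associated with block i, so the K
        # values are each tied to a different node and a different block.
        associated_block = i
        crc = 0
        for j, sub_block in enumerate(node):
            if j == associated_block:
                # Skip the one sub-block of this node that lies in the
                # block the value is associated with.
                continue
            # CRC-32 is an assumed stand-in for the characterising function.
            crc = zlib.crc32(sub_block, crc)
        values.append(crc)
    return values


# K = 3 source nodes, three sub-blocks each.
nodes = [
    [b"a0", b"a1", b"a2"],
    [b"b0", b"b1", b"b2"],
    [b"c0", b"c1", b"c2"],
]
values = characterising_values(nodes)
```

Under these assumptions, the value for a source node does not depend on the sub-block that shares a block with it: changing `nodes[0][0]` leaves `values[0]` unchanged, while changing `nodes[0][1]` alters it.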