Methods of determining a file similarity fingerprint
Abstract:
A similarity fingerprint for a data object such as a file can be automatically determined using one or more anchor values. The one or more anchor values can be provided or determined. For each anchor value, a set of distances between each instance of the anchor value in the data object is determined. The set of distances for the instance of the anchor value is aggregated into a single value. The single value is added as a component of the similarity fingerprint. Thus, if there are N anchor values, there can be N components of the similarity fingerprint. The similarity fingerprints of different data objects can be compared and the results of the comparison can be used to determine how similar the data objects are.
Public/Granted literature
Information query
Patent Agency Ranking
0/0