Invention Grant
US08359472B1 Document fingerprinting with asymmetric selection of anchor points (Active)

Document fingerprinting with asymmetric selection of anchor points
Abstract:
One embodiment relates to a computer-implemented process for generating document fingerprints. A document is normalized to create a normalized text string. A first hash function with a sliding hash window is applied to the normalized text string to generate an array of hash values. Candidate anchoring points are selected by applying a first filter to the array of hash values. The anchoring points are chosen by applying a second filter to the candidate anchoring points. Finally, a second hash function is applied to substrings located at the chosen anchoring points to generate hash values for use as fingerprints for the document. Other embodiments and aspects are also disclosed.
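The five steps in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the patented method: the abstract does not say which hash functions or filters are used, so the polynomial rolling hash, the divisibility test used as the first filter, the minimum-spacing rule used as the second filter, SHA-1 as the second hash function, and all constants (WINDOW, MODULUS, MIN_GAP, SUBSTR_LEN) are assumptions chosen for readability.

```python
import hashlib
import re

# All constants below are illustrative assumptions, not values from the patent.
WINDOW = 8        # length of the sliding hash window
MODULUS = 4       # first filter keeps hash values divisible by this
MIN_GAP = 6       # second filter enforces this spacing between anchors
SUBSTR_LEN = 16   # length of the substring hashed at each anchor
BASE = 257
PRIME = (1 << 61) - 1


def normalize(text):
    """Step 1: lowercase and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def rolling_hashes(s, w=WINDOW):
    """Step 2: polynomial rolling hash of every window of length w."""
    if len(s) < w:
        return []
    high = pow(BASE, w - 1, PRIME)  # weight of the character leaving the window
    h = 0
    for ch in s[:w]:
        h = (h * BASE + ord(ch)) % PRIME
    hashes = [h]
    for i in range(w, len(s)):
        h = (h - ord(s[i - w]) * high) % PRIME  # drop the oldest character
        h = (h * BASE + ord(s[i])) % PRIME      # append the newest character
        hashes.append(h)
    return hashes


def candidate_anchors(hashes, modulus=MODULUS):
    """Step 3 (assumed first filter): positions whose hash is divisible by modulus."""
    return [i for i, h in enumerate(hashes) if h % modulus == 0]


def choose_anchors(candidates, min_gap=MIN_GAP):
    """Step 4 (assumed second filter): thin candidates to a minimum spacing."""
    chosen = []
    for pos in candidates:
        if not chosen or pos - chosen[-1] >= min_gap:
            chosen.append(pos)
    return chosen


def fingerprints(text):
    """Step 5: hash the substring at each chosen anchor with a second hash function."""
    s = normalize(text)
    anchors = choose_anchors(candidate_anchors(rolling_hashes(s)))
    return {
        hashlib.sha1(s[a:a + SUBSTR_LEN].encode("ascii")).hexdigest()
        for a in anchors
        if a + SUBSTR_LEN <= len(s)
    }
```

In this sketch, two documents would be compared by intersecting the sets returned by `fingerprints`. Because anchors depend only on local window content, text that differs only in case, spacing, or punctuation normalizes to the same string and produces the same fingerprints.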