Invention Grant
- Patent Title: System and method for duplicate text recognition
- Patent Title (中): 重复文本识别的系统和方法
- Application No.: US12619690
- Application Date: 2009-11-17
- Publication No.: US08577155B2
- Publication Date: 2013-11-05
- Inventor: Tat Ming Damein Wu, Ka Yeung Sin
- Applicant: Tat Ming Damein Wu, Ka Yeung Sin
- Applicant Address: HK Hong Kong
- Assignee: Wisers Information Limited
- Current Assignee: Wisers Information Limited
- Current Assignee Address: HK Hong Kong
- Priority: CN200910134840 20090407
- Main IPC: G06K9/00
- IPC: G06K9/00; G06K9/36; G06F7/00; G06F17/00

Abstract:
A system for duplicate text recognition includes a first means for dividing an electronic text into a plurality of phrase segments; a second means for converting each of the phrase segments into a unique and fixed-length bit string; a third means for storing a plurality of groups of the bit strings, each group of bit strings (string group) including a plurality of bit strings respectively corresponding to the phrase segments in a particular electronic text; and a fourth means for determining whether a predefined similarity between any two string groups in the third means reaches a first threshold, and for determining the two electronic texts corresponding to the two string groups are duplicate texts if the predefined similarity between the two string groups reaches the first threshold.
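The four means described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything concrete in it is an assumption made for illustration: the punctuation-based segmentation rule, the use of a truncated SHA-256 digest as the fixed-length bit string (64 bits, so collisions are unlikely but not impossible), Jaccard similarity as the "predefined similarity", the default threshold of 0.8, and all function names.

```python
import hashlib
import re

# Assumed segmentation rule: split on common Western and CJK punctuation.
_SEGMENT_BREAK = re.compile(r"[,.;:!?\n，。；：！？]+")


def split_into_segments(text):
    """First means: divide a text into phrase segments."""
    parts = _SEGMENT_BREAK.split(text)
    return [p.strip().lower() for p in parts if p.strip()]


def segment_to_bits(segment, n_bytes=8):
    """Second means: map a segment to a fixed-length bit string.

    Returns the first n_bytes of a SHA-256 digest as an integer
    (64 bits by default). This is an assumed choice of hash.
    """
    digest = hashlib.sha256(segment.encode("utf-8")).digest()[:n_bytes]
    return int.from_bytes(digest, "big")


def string_group(text):
    """Third means: the group of bit strings for one text."""
    return frozenset(segment_to_bits(s) for s in split_into_segments(text))


def similarity(group_a, group_b):
    """Assumed similarity measure: Jaccard index of two string groups."""
    union = group_a | group_b
    if not union:
        return 0.0
    return len(group_a & group_b) / len(union)


def is_duplicate(text_a, text_b, threshold=0.8):
    """Fourth means: duplicates if similarity reaches the threshold."""
    return similarity(string_group(text_a), string_group(text_b)) >= threshold
```

For example, two three-segment texts that share two segments have four distinct bit strings between them, so their Jaccard similarity is 2/4 = 0.5. They count as duplicates at a threshold of 0.5 but not at the default of 0.8.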
Public/Granted literature
- US20100254613A1 SYSTEM AND METHOD FOR DUPLICATE TEXT RECOGNITION, Public/Granted day: 2010-10-07