Invention Grant
US08626767B2 Computer-implemented system and method for identifying near duplicate messages
Active
- Patent Title: Computer-implemented system and method for identifying near duplicate messages
- Application No.: US13909065
- Application Date: 2013-06-03
- Publication No.: US08626767B2
- Publication Date: 2014-01-07
- Inventor: Kenji Kawai, David T. McDonald
- Applicant: FTI Technology LLC
- Applicant Address: US MD Annapolis
- Assignee: FTI Technology LLC
- Current Assignee: FTI Technology LLC
- Current Assignee Address: US MD Annapolis
- Agent: Patrick J. S. Inouye; Krista A. Wittman
- Main IPC: G06F7/00
- IPC: G06F7/00; G06F17/30; G06F11/14

Abstract:
A computer-implemented system and method for identifying near duplicate messages is provided. Messages each including a content body are grouped by conversation thread. One or more of the messages also includes an attachment. The messages for each conversation thread are sorted in order of message length. At least one of the messages is selected from one of the threads and the body of the selected message is compared with the body of one such shorter message in that thread. A determination is made that the body of the shorter message is included in the body of the selected message. Hash codes of the attachments for the selected message and the shorter message are compared. The shorter message is marked as a near duplicate message of the selected message when the hash codes of the attachments match.
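
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Message` class, the `find_near_duplicates` function, the use of SHA-256 for attachment hashes, and plain substring containment for the body comparison are all assumptions made for illustration. For simplicity, the sketch also compares each message against every later message in sort order, so equal-length messages are compared as well.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Message:
    # Hypothetical message record; field names are illustrative only.
    msg_id: str
    thread_id: str
    body: str
    attachments: list = field(default_factory=list)  # raw bytes per attachment


def attachment_hashes(msg):
    # Assumption: SHA-256 over attachment bytes; the abstract only says "hash codes".
    return sorted(hashlib.sha256(a).hexdigest() for a in msg.attachments)


def find_near_duplicates(messages):
    """Return {near_duplicate_id: id of the longer message that contains it}."""
    # Group messages by conversation thread.
    threads = defaultdict(list)
    for m in messages:
        threads[m.thread_id].append(m)

    near_dupes = {}
    for thread in threads.values():
        # Sort each thread by body length, longest first.
        thread.sort(key=lambda m: len(m.body), reverse=True)
        for i, longer in enumerate(thread):
            for shorter in thread[i + 1:]:
                if shorter.msg_id in near_dupes:
                    continue  # already marked against a longer message
                # The shorter body must be contained in the longer body,
                # and the attachment hash codes must match.
                if (shorter.body in longer.body
                        and attachment_hashes(shorter) == attachment_hashes(longer)):
                    near_dupes[shorter.msg_id] = longer.msg_id
    return near_dupes
```

In this sketch a message is marked only when both conditions hold, so a quoted reply whose attachment differs from that of the longer message is left unmarked, as is a matching message in a different thread.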
Public/Granted literature
- US20130268610A1 Computer-Implemented System And Method For Identifying Near Duplicate Messages; Public/Granted day: 2013-10-10