Invention Grant
US07836054B2 System and method for processing a message store for near duplicate messages
Active
- Patent Title: System and method for processing a message store for near duplicate messages
- Application No.: US12542581
- Application Date: 2009-08-17
- Publication No.: US07836054B2
- Publication Date: 2010-11-16
- Inventor: Kenji Kawai, David T. McDonald
- Applicant: Kenji Kawai, David T. McDonald
- Applicant Address: US MD Baltimore
- Assignee: FTI Technology LLC
- Current Assignee: FTI Technology LLC
- Current Assignee Address: US MD Baltimore
- Agent: Patrick J. S. Inouye; Scott E. Smith
- Main IPC: G06F7/00
- IPC: G06F7/00; G06F17/30

Abstract:
A system and method for processing a message store for near duplicate messages is provided. Metadata, content, and each attachment associated with messages are extracted. Near duplicate messages in the message store are identified. Compound digests taken of the metadata for, of the content contained in, and of each attachment associated with each of the messages in the message store are compared. Each message having a compound digest not matching the compound digest of any other message is marked as unique, and each message having a compound digest matching the compound digest of at least one other message is marked as an exact duplicate. Messages remaining unmarked and having similar content are grouped into sets that each include one or more near duplicate messages. One of the near duplicate messages is designated as unique and each remaining near duplicate message in the set is designated as a near duplicate.
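The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the `Message` type, the use of SHA-256 for the compound digest, the `difflib` similarity ratio with a hypothetical `threshold` of 0.9, and the greedy grouping order. The abstract also does not spell out which messages remain unmarked after the digest comparison; this sketch assumes that messages with a non-matching digest are only provisionally unique and are then compared by content.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Message:
    """Hypothetical message record; field names are illustrative only."""
    msg_id: str
    metadata: str                                     # e.g. sender|recipients|subject
    content: str                                      # message body text
    attachments: list = field(default_factory=list)   # raw attachment bytes


def compound_digest(msg):
    """Combine per-part digests of metadata, content, and each attachment.

    SHA-256 is an assumption; the abstract does not name a hash function.
    """
    outer = hashlib.sha256()
    parts = [msg.metadata.encode(), msg.content.encode(), *msg.attachments]
    for part in parts:
        outer.update(hashlib.sha256(part).digest())
    return outer.hexdigest()


def classify(messages, threshold=0.9):
    """Return a dict mapping msg_id to 'exact duplicate', 'unique', or 'near duplicate'."""
    by_digest = defaultdict(list)
    for msg in messages:
        by_digest[compound_digest(msg)].append(msg)

    marks = {}
    candidates = []  # messages whose digest matched no other message
    for group in by_digest.values():
        if len(group) > 1:
            for msg in group:
                marks[msg.msg_id] = "exact duplicate"
        else:
            candidates.append(group[0])

    # Greedy grouping (an assumption): the first message of each set is its
    # representative and is designated unique; similar later ones are near duplicates.
    representatives = []
    for msg in candidates:
        for rep in representatives:
            ratio = SequenceMatcher(None, rep.content, msg.content).ratio()
            if ratio >= threshold:
                marks[msg.msg_id] = "near duplicate"
                break
        else:
            representatives.append(msg)
            marks[msg.msg_id] = "unique"
    return marks
```

Calling `classify` on a list of `Message` objects yields one label per message. A production system would likely replace the pairwise `difflib` comparison with something that scales better, since comparing every candidate against every representative is quadratic in the worst case.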
Public/Granted literature
- US20090307630A1 System And Method for Processing A Message Store For Near Duplicate Messages. Public/Granted day: 2009-12-10