Invention Grant
US08527475B1 System and method for identifying structured data items lacking requisite information for rule-based duplicate detection
Active
- Patent Title: System and method for identifying structured data items lacking requisite information for rule-based duplicate detection
- Application No.: US13239068
- Application Date: 2011-09-21
- Publication No.: US08527475B1
- Publication Date: 2013-09-03
- Inventor: Roshan Ram Rammohan, Madhu M Kurup, Srikanth Thirumalai
- Applicant: Roshan Ram Rammohan, Madhu M Kurup, Srikanth Thirumalai
- Applicant Address: US NV Reno
- Assignee: Amazon Technologies, Inc.
- Current Assignee: Amazon Technologies, Inc.
- Current Assignee Address: US NV Reno
- Agency: Meyertons, Hood, Kivlin, Kowert & Goetzel, P.C.
- Agent: Robert C. Kowert
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Embodiments of a system and method for identifying structured data items lacking requisite information for rule-based duplicate detection are described. Embodiments may include generating a deficiency score for each of multiple structured data items including applying a set of rules based on duplicate detection techniques to each given structured data item in order to perform a comparison of the given structured data item to itself. The deficiency score of the given structured data item may be based on a result of the comparison. Embodiments may also include, based on the deficiency scores of the structured data items, identifying one or more deficient structured data items having less than a requisite quantity of information for performing duplicate detection on structured data items. Embodiments may also include identifying one or more key attributes missing from some of the one or more deficient structured data items and requesting those key attributes.
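
The self-comparison idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the rule form (case-insensitive equality on one attribute), the integer weights, the attribute names `title`, `brand`, and `model_number`, the 0.4 threshold, and every function name below are hypothetical choices made only for illustration. The sketch assumes that a rule can contribute to a match only when the attribute it inspects is present, so an item compared to itself scores below the maximum exactly when it lacks information the rules depend on.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# A structured data item is modeled as attribute name -> value (or None).
Item = Dict[str, Optional[str]]


@dataclass(frozen=True)
class Rule:
    """Hypothetical duplicate-detection rule: weighted equality on one attribute."""
    attribute: str
    weight: int

    def match(self, a: Item, b: Item) -> int:
        va = (a.get(self.attribute) or "").strip().lower()
        vb = (b.get(self.attribute) or "").strip().lower()
        # A missing or blank value can never support a match.
        if not va or not vb:
            return 0
        return self.weight if va == vb else 0


# Assumed rule set; a real system would use its own duplicate-detection rules.
RULES = [Rule("title", 5), Rule("brand", 3), Rule("model_number", 2)]


def deficiency_score(item: Item, rules: List[Rule] = RULES) -> float:
    """Compare the item to itself; 0.0 means every rule could fire."""
    total = sum(r.weight for r in rules)
    self_match = sum(r.match(item, item) for r in rules)
    return 1 - self_match / total


def missing_key_attributes(item: Item, rules: List[Rule] = RULES) -> List[str]:
    """Attributes whose rules could not fire in the self-comparison."""
    return [r.attribute for r in rules if r.match(item, item) == 0]


def find_deficient(items: Dict[str, Item], threshold: float = 0.4) -> Dict[str, List[str]]:
    """Map each deficient item ID to the key attributes that could be requested."""
    return {
        item_id: missing_key_attributes(item)
        for item_id, item in items.items()
        if deficiency_score(item) > threshold
    }
```

Under these assumptions, an item with all three attributes scores 0.0 and is not flagged, while an item with only a title scores 0.5 and is reported together with `brand` and `model_number` as the attributes to request.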