Invention Grant
- Patent Title: Dynamic record blocking
- Patent Title (中): 动态记录封锁
- Application No.: US13349414
- Application Date: 2012-01-12
- Publication No.: US08645399B2
- Publication Date: 2014-02-04
- Inventor: William P. McNeill, Andrew Borthwick
- Applicant: William P. McNeill, Andrew Borthwick
- Applicant Address: US WA Bellevue
- Assignee: Intelius Inc.
- Current Assignee: Intelius Inc.
- Current Assignee Address: US WA Bellevue
- Agency: Nixon & Vanderhye P.C.
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Dynamic blocking determines which pairs of records in a data set should be examined as potential duplicates. Records are grouped into blocks by shared properties that are indicators of duplication. Blocks that are too large to be processed efficiently are further subdivided by other properties chosen in a data-driven way. We demonstrate the viability of this algorithm for large data sets. We have scaled this system up to work on billions of records on an 80-node Hadoop cluster.
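The grouping and subdivision steps described in the abstract can be illustrated with a small single-machine sketch. This is not the patented implementation: the function names (`dynamic_blocks`, `candidate_pairs`), the `max_block_size` threshold, and the sample records are all hypothetical, and the sketch simply tries every remaining property when subdividing an oversized block rather than choosing properties in a data-driven way or running on Hadoop.

```python
from collections import defaultdict
from itertools import combinations


def _group(ids, prop, records):
    """Group record ids by their value for `prop`, skipping missing values."""
    groups = defaultdict(list)
    for rid in ids:
        value = records[rid].get(prop)
        if value is not None:
            groups[value].append(rid)
    return list(groups.values())


def dynamic_blocks(records, properties, max_block_size):
    """Return sorted blocks (tuples of record ids) of at most max_block_size.

    Assumption for this sketch: an oversized block is subdivided by every
    property not yet used on it; blocks that stay oversized once all
    properties are exhausted are dropped.
    """
    blocks = set()

    def refine(ids, used):
        if len(ids) < 2:
            return  # a single record yields no pairs to compare
        if len(ids) <= max_block_size:
            blocks.add(tuple(sorted(ids)))
            return
        for prop in properties:
            if prop not in used:
                for sub in _group(ids, prop, records):
                    refine(sub, used | {prop})

    # Top-level blocks: one per (property, value) combination.
    for prop in properties:
        for ids in _group(records.keys(), prop, records):
            refine(ids, frozenset([prop]))
    return sorted(blocks)


def candidate_pairs(blocks):
    """All distinct record pairs that share at least one accepted block."""
    pairs = set()
    for block in blocks:
        pairs.update(combinations(block, 2))
    return pairs


# Hypothetical sample data, for illustration only.
records = {
    1: {"last": "smith", "first": "john", "zip": "98004"},
    2: {"last": "smith", "first": "john", "zip": "98101"},
    3: {"last": "smith", "first": "mary", "zip": "98004"},
    4: {"last": "jones", "first": "mary", "zip": "98004"},
}
blocks = dynamic_blocks(records, ["last", "zip"], max_block_size=2)
print(blocks)
print(candidate_pairs(blocks))
```

With a threshold of 2, the `last == "smith"` block (three records) is too large and is subdivided by `zip`, so only records 1 and 3 remain as a candidate pair. Raising the threshold accepts the larger blocks as they are and produces more pairs to examine.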
Public/Granted literature
- US20130173560A1 DYNAMIC RECORD BLOCKING Public/Granted day: 2013-07-04