Invention Grant
US08719267B2 Spectral neighborhood blocking for entity resolution (Active)

Spectral neighborhood blocking for entity resolution
Abstract:
A processing device of an information processing system is operative to obtain a plurality of records, documents, web pages or other data objects, and to construct a binary tree using a bipartition procedure in which subsets of the data objects are associated with respective nodes of the tree. Evaluation of a designated modularity for a given one of the nodes of the tree is used as a stopping criterion to prevent further partitioning of that node and to indicate designation of that node as a leaf node of the tree. The resulting leaf nodes of the tree provide a non-overlapping partitioning of the plurality of data objects. The processing device is further operative to perform a neighborhood search on the tree to identify pairs of the plurality of data objects that match the same entity, and to store an indication of the matching pairs of data objects.
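The procedure in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions the abstract does not state: the data objects are given as a symmetric similarity matrix `A` (hypothetical), the bipartition uses the sign pattern of the leading eigenvector of Newman's modularity matrix, the "designated modularity" is taken to be the Newman modularity gain of a split, and the neighborhood search is simplified to comparing only pairs that fall in the same leaf. All function names (`leading_vector`, `build_tree`, `leaves`, `matching_pairs`) and the toy data are illustrative.

```python
from itertools import combinations

def leading_vector(M, iters=500):
    """Power iteration for the eigenvector of the largest eigenvalue of symmetric M."""
    n = len(M)
    # Gershgorin shift: every eigenvalue of M + c*I is non-negative, so the
    # iteration converges toward the most positive eigenvalue of M.
    c = max(sum(abs(x) for x in row) for row in M)
    v = [float(i + 1) for i in range(n)]  # deterministic, asymmetric start
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) + c * v[i] for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def build_tree(A, nodes=None, min_gain=1e-9):
    """Return a binary tree: a leaf is a sorted list of ids, an internal node a (left, right) tuple."""
    if nodes is None:
        nodes = list(range(len(A)))
    deg = [sum(row) for row in A]
    two_m = sum(deg)
    if len(nodes) < 2 or two_m == 0:
        return sorted(nodes)
    # Modularity matrix restricted to this node's subset (assumed Newman form).
    B = [[A[i][j] - deg[i] * deg[j] / two_m for j in nodes] for i in nodes]
    for r, row in enumerate(B):
        B[r][r] -= sum(row)  # subtract the row sum so each row of B sums to zero
    s = [1 if x >= 0 else -1 for x in leading_vector(B)]
    size = len(s)
    # Modularity gain of the proposed split: s^T B s / (4m).
    gain = sum(s[a] * B[a][b] * s[b] for a in range(size) for b in range(size)) / (2 * two_m)
    left = [n for n, sign in zip(nodes, s) if sign > 0]
    right = [n for n, sign in zip(nodes, s) if sign < 0]
    # Stopping criterion: no positive gain means this node becomes a leaf.
    if not left or not right or gain <= min_gain:
        return sorted(nodes)
    return (build_tree(A, left, min_gain), build_tree(A, right, min_gain))

def leaves(tree):
    """Flatten the tree into its leaf blocks (a non-overlapping partition)."""
    if isinstance(tree, list):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def matching_pairs(tree, is_match):
    """Simplified search: test only pairs of objects that share a leaf."""
    return [(i, j) for leaf in leaves(tree)
            for i, j in combinations(leaf, 2) if is_match(i, j)]

# Toy similarity graph (hypothetical): two triangles joined by one edge (2-3).
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 1, 0, 0, 1],
     [0, 0, 0, 1, 1, 0]]
A[4][2] = 0  # keep the matrix symmetric: only 2-3 links the triangles
records = ["ann", "ann", "bob", "ann", "cy", "cy"]  # hypothetical entity labels

tree = build_tree(A)
blocks = sorted(leaves(tree))
pairs = sorted(matching_pairs(tree, lambda i, j: records[i] == records[j]))
```

On this toy graph the first split separates the two triangles, and neither triangle yields a positive gain when split again, so both become leaves. Records 0 and 3 carry the same label but land in different leaves, so this simplified within-leaf comparison does not report them; that is the recall cost that a broader neighborhood search over the tree would be expected to reduce.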