Descriptor uniqueness for entity clustering
Abstract:
A mechanism is provided in a data processing system to implement a cognitive natural language processing (NLP) system with descriptor uniqueness identification to support named entity mention clustering. The mechanism annotates a set of documents from a corpus of documents for entity types and mentions, collects descriptor usages from all documents in the corpus of documents, analyzes the descriptor usages to classify the descriptors as base terms or modifier terms, generates compatibility scores for the descriptors, and performs entity merging of entity clusters based on the compatibility scores.
Public/Granted literature
Information query
Patent Agency Ranking
0/0