Unsupervised corpus expansion using domain-specific terms
Abstract:
In an approach to unsupervised corpus expansion using domain-specific terms, one or more computer processors retrieve one or more domain-specific terms from a corpus of text. One or more computer processors search the World Wide Web for the one or more domain-specific terms to produce a plurality of web pages associated with each of the one or more domain-specific terms. One or more computer processors determine a confidence score for each of the plurality of web pages. One or more computer processors determine that the confidence score of at least one of the plurality of web pages exceeds a pre-defined threshold. One or more computer processors add the at least one of the plurality of web pages to the corpus of text.
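The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the confidence score is computed, so `term_overlap_score` (the fraction of domain terms that appear in a page) is a stand-in assumption. The names `expand_corpus`, `search`, and `threshold` are likewise hypothetical, the domain-specific terms are assumed to have already been retrieved from the corpus, and the web search is passed in as a callable so that the sketch runs without network access.

```python
from typing import Callable, Iterable, List


def term_overlap_score(page_text: str, terms: Iterable[str]) -> float:
    """Hypothetical confidence score: fraction of domain terms found in the page."""
    lowered_terms = [t.lower() for t in terms]
    if not lowered_terms:
        return 0.0
    text = page_text.lower()
    return sum(1 for t in lowered_terms if t in text) / len(lowered_terms)


def expand_corpus(
    corpus: List[str],
    terms: List[str],
    search: Callable[[str], List[str]],
    threshold: float = 0.5,
) -> List[str]:
    """Add pages whose score exceeds the threshold; return the pages added.

    `search` is an assumed callable that maps one term to a list of page texts.
    """
    added: List[str] = []
    seen = set(corpus)  # avoid adding a page that is already in the corpus
    for term in terms:
        for page in search(term):
            if page in seen:
                continue
            # "Exceeds" is read as strictly greater than the threshold.
            if term_overlap_score(page, terms) > threshold:
                corpus.append(page)
                seen.add(page)
                added.append(page)
    return added
```

In practice `search` would wrap a real web search client and page fetcher, and the overlap score could be replaced by any relevance measure, provided the same pre-defined threshold comparison decides which pages enter the corpus.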