Invention Grant
- Patent Title: Unsupervised information extraction dictionary creation
- Application No.: US15342361
- Application Date: 2016-11-03
- Publication No.: US10558747B2
- Publication Date: 2020-02-11
- Inventor: Sheng Hua Bao, Su Yan
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: ZIP Group PLLC
- Main IPC: G06F17/27
- IPC: G06F17/27; G06F16/33; G06F16/36; G06F16/332

Abstract:
A data handling system enables the unsupervised creation of an information extraction dictionary by expanding upon a word or phrase included in an expansion query. Before receiving the expansion query, the data handling system performs unsupervised learning on an information corpus that includes text, assigning a corpus vector to each word and phrase of the text. After receiving the expansion query, the data handling system compares the expansion query to the corpus vectors. It ranks the corpus vectors by similarity to the expansion query and provides a ranked list of the words or phrases associated with the ranked corpus vectors. The ranked list may subsequently be used as the information extraction dictionary.
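
The compare-and-rank step described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the toy three-dimensional vectors are hand-written stand-ins for learned corpus vectors, cosine similarity is one plausible choice of similarity measure, the query term is assumed to already have a corpus vector, and the names `CORPUS_VECTORS`, `cosine_similarity`, and `expand_query` are hypothetical.

```python
import math

# Hypothetical toy vectors standing in for the corpus vectors that the
# unsupervised learning step would assign to each word or phrase.
CORPUS_VECTORS = {
    "aspirin": [0.9, 0.1, 0.0],
    "ibuprofen": [0.8, 0.2, 0.1],
    "acetaminophen": [0.7, 0.3, 0.0],
    "headache": [0.1, 0.9, 0.2],
    "keyboard": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def expand_query(query, corpus_vectors, top_k=3):
    """Rank corpus terms by similarity to the query term.

    Assumes the query term already has a corpus vector; the query itself
    is excluded from the ranked result.
    """
    if query not in corpus_vectors:
        raise KeyError(f"no corpus vector for query term: {query!r}")
    query_vec = corpus_vectors[query]
    scored = [
        (term, cosine_similarity(query_vec, vec))
        for term, vec in corpus_vectors.items()
        if term != query
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


ranked = expand_query("aspirin", CORPUS_VECTORS)
# The ranked terms serve as the candidate extraction dictionary.
dictionary = [term for term, _score in ranked]
print(dictionary)
```

With these toy vectors, terms whose vectors point in nearly the same direction as the query rank first, so a cutoff such as `top_k` (or a similarity threshold) would decide how many ranked terms enter the dictionary.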
Public/Granted literature
- US20180121444A1 UNSUPERVISED INFORMATION EXTRACTION DICTIONARY CREATION, Public/Granted day: 2018-05-03