Invention Grant
- Patent Title: Automatically bootstrapping a domain-specific vocabulary
- Application No.: US16559012
- Application Date: 2019-09-03
- Publication No.: US11244116B2
- Publication Date: 2022-02-08
- Inventor: Brendan Bull, Paul Lewis Felt, Andrew G. Hicks
- Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Applicant Address: US NY Armonk
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee Address: US NY Armonk
- Agent: L. Jeffrey Kelly
- Main IPC: G06F40/30
- IPC: G06F40/30; G06N3/08; G06N3/04

Abstract:
A computer-implemented method, system and computer program product for automatically bootstrapping a domain-specific vocabulary from at least one source document using one or more computers, by: (a) encoding one or more passages in the source document to identify one or more relevant words therein, wherein the encoding assigns an importance to the relevant words using an attention mechanism (AM) on top of a recurrent neural network (RNN); (b) expanding the relevant words using word embedding distance, ontology information, or multi-part analogies; and (c) mapping the expanded words to concepts for inclusion into the domain-specific vocabulary, wherein concept disambiguation is performed to ensure that incorrect concepts are not included into the domain-specific vocabulary.
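Steps (a) and (b) of the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the attention scores are hard-coded stand-ins for what an attention mechanism on top of an RNN would produce, the embeddings are hypothetical three-dimensional toy vectors, and the function names (`relevant_words`, `expand`) and thresholds are assumptions chosen for illustration. Only the word-embedding-distance form of expansion is shown; ontology information, multi-part analogies, and the concept mapping of step (c) are omitted.

```python
import math


def softmax(scores):
    """Normalize raw scores into attention-style weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relevant_words(tokens, scores, threshold):
    """Step (a), simplified: keep tokens whose attention weight meets the threshold."""
    weights = softmax(scores)
    return [t for t, w in zip(tokens, weights) if w >= threshold]


def expand(seeds, embeddings, min_similarity):
    """Step (b), simplified: add vocabulary words close to a seed in embedding space."""
    expanded = set(seeds)
    for seed in seeds:
        if seed not in embeddings:
            continue  # no vector for this seed, so nothing to compare against
        for word, vec in embeddings.items():
            if cosine(embeddings[seed], vec) >= min_similarity:
                expanded.add(word)
    return sorted(expanded)


# Hypothetical toy data, not taken from the patent.
TOKENS = ["patient", "reported", "severe", "migraine"]
SCORES = [0.1, 0.0, 0.2, 3.0]  # stand-ins for RNN-derived attention scores
EMBEDDINGS = {
    "migraine": [0.9, 0.1, 0.0],
    "headache": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

seeds = relevant_words(TOKENS, SCORES, threshold=0.5)
vocabulary = expand(seeds, EMBEDDINGS, min_similarity=0.9)
print(seeds, vocabulary)
```

With this toy data, the only token whose weight exceeds 0.5 is "migraine", and "headache" is the only other word within the similarity threshold, so the expanded set contains those two words. A real system would then pass the expanded words to a concept-mapping and disambiguation step before adding them to the vocabulary.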
Public/Granted literature
- US20210064702A1 AUTOMATICALLY BOOTSTRAPPING A DOMAIN-SPECIFIC VOCABULARY Public/Granted day: 2021-03-04