Invention Grant
- Patent Title: Ontology-based document analysis and annotation generation
- Application No.: US16270431
- Application Date: 2019-02-07
- Publication No.: US10909320B2
- Publication Date: 2021-02-02
- Inventor: Brendan Bull, Paul Lewis Felt, Andrew Hicks
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Patterson + Sheridan, LLP
- Main IPC: G06F17/20
- IPC: G06F17/20; G06F40/289; G06F16/93; G06F40/30

Abstract:
Techniques for cognitive annotation are provided. An electronic document including textual data is received. A plurality of importance scores are generated for a plurality of words included in the electronic document by processing the electronic document using a trained passage encoder. Important words are identified based on the plurality of importance scores. One or more clusters of words are generated, where each of the one or more clusters of words includes at least one of the plurality of important words. A representative word is selected for a first cluster, and the representative word is mapped to one or more concepts from a predefined list of concepts. The one or more concepts are disambiguated to identify a set of relevant concepts for the electronic document. An annotated version of the electronic document is generated based at least in part on the set of relevant concepts.
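The steps in the abstract can be read as a pipeline: score words, keep the important ones, cluster them, pick a representative per cluster, map it to candidate concepts, disambiguate, and annotate. The following is a minimal sketch of that flow only, not the patented implementation. Everything in it is an assumption made for illustration: the `CONCEPTS` table and its identifiers are hypothetical, `score_words` is a toy stand-in for the trained passage encoder (it scores by word length), clustering is by shared four-letter prefix, the representative is simply the alphabetically first member, disambiguation keeps the first candidate, and applying the chosen concept to every cluster member is a design choice of this sketch.

```python
from collections import defaultdict

# Hypothetical stand-in for the "predefined list of concepts".
CONCEPTS = {
    "diabetes": ["C_DIABETES_MELLITUS", "C_DIABETES_INSIPIDUS"],
    "insulin": ["C_INSULIN"],
}


def score_words(words):
    """Toy stand-in for the trained passage encoder: longer words score higher."""
    longest = max(len(w) for w in words)
    return [len(w) / longest if longest else 0.0 for w in words]


def annotate(text, threshold=0.7):
    """Return the text with concept tags appended to annotated words."""
    tokens = text.split()
    words = [t.strip(".,;:").lower() for t in tokens]
    if not words:
        return text

    # 1. Importance scores, then keep words at or above the threshold.
    scores = score_words(words)
    important = [w for w, s in zip(words, scores) if s >= threshold]

    # 2. Toy clustering: group important words by their first four letters.
    clusters = defaultdict(list)
    for w in important:
        clusters[w[:4]].append(w)

    # 3. Representative word -> candidate concepts -> disambiguation.
    relevant = {}
    for members in clusters.values():
        representative = min(members)  # toy rule: alphabetically first
        candidates = CONCEPTS.get(representative, [])
        if candidates:
            concept = candidates[0]  # toy disambiguation: first candidate
            for m in members:
                relevant[m] = concept

    # 4. Annotated version of the document.
    return " ".join(
        f"{t}[{relevant[w]}]" if w in relevant else t
        for t, w in zip(tokens, words)
    )


if __name__ == "__main__":
    print(annotate("The diabetic patient has diabetes and takes insulin"))
```

In a real system each toy step would be replaced by its learned or ontology-backed counterpart; the sketch is only meant to make the order of operations and the data passed between them concrete.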
Public/Granted literature
- US20200257761A1 ONTOLOGY-BASED DOCUMENT ANALYSIS AND ANNOTATION GENERATION Public/Granted day: 2020-08-13