Invention Grant
- Patent Title: Preparing documents for coreference analysis
- Application No.: US17139147
- Application Date: 2020-12-31
- Publication No.: US11556574B2
- Publication Date: 2023-01-17
- Inventor: Anton Yegorin
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Brian D. Welle
- Main IPC: G06F16/33
- IPC: G06F16/33 ; G06F16/332 ; G06F40/247

Abstract:
Unstructured text is identified as larger than a threshold size. Named-entity recognition analysis is executed on the unstructured text. One or more anchor entities of the unstructured text are determined that each occur more than a threshold amount of times within the unstructured text. Two or more instances of the one or more anchor entities that are separated by at least a threshold amount of text of the unstructured text are identified. The unstructured text is partitioned into at least three sections. The unstructured text is partitioned at respective natural language demarcation points associated with each of the two or more instances such that each of the at least three sections is smaller than the threshold size. Separate coreference analyses are performed in parallel on each of the at least three sections.
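The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`find_mentions`, `partition`, `resolve_coreferences`, `analyze_in_parallel`) are hypothetical, the name lookup stands in for a real named-entity recognizer, sizes are measured in characters, a period followed by a space is assumed to be the natural language demarcation point, and the coreference step is a placeholder that only collects pronouns. The sketch does not enforce the "at least three sections" condition; it only rejects a split that leaves a section at or above the size threshold.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def find_mentions(text, names):
    """Placeholder for NER (assumption): locate known names as (name, offset)."""
    pattern = r"\b(?:" + "|".join(map(re.escape, names)) + r")\b"
    return [(m.group(), m.start()) for m in re.finditer(pattern, text)]


def partition(text, names, max_size, min_count, min_gap):
    """Split text at sentence starts that precede well-separated anchor mentions."""
    if len(text) <= max_size:
        return [text]  # below the size threshold: nothing to do

    mentions = find_mentions(text, names)
    counts = Counter(name for name, _ in mentions)
    # Anchor entities occur more than min_count times.
    anchors = {name for name, n in counts.items() if n > min_count}

    cuts, last = [], 0
    for name, pos in mentions:
        if name not in anchors:
            continue
        # Demarcation point: start of the sentence containing the mention.
        idx = text.rfind(". ", 0, pos)
        cut = idx + 2 if idx != -1 else 0
        # Keep only cuts separated by at least min_gap characters.
        if cut - last >= min_gap:
            cuts.append(cut)
            last = cut

    bounds = [0] + cuts + [len(text)]
    sections = [text[a:b] for a, b in zip(bounds, bounds[1:])]
    if any(len(s) >= max_size for s in sections):
        raise ValueError("a section is still at or above max_size")
    return sections


def resolve_coreferences(section):
    """Placeholder for coreference analysis (assumption): collect pronouns."""
    return re.findall(r"\b(?:he|she|it|they)\b", section, flags=re.IGNORECASE)


def analyze_in_parallel(sections):
    """Run the placeholder analysis on every section concurrently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(resolve_coreferences, sections))
```

Cutting at the start of a sentence that contains an anchor entity means each section opens with an explicit mention, so pronouns later in that section have a nearby antecedent to resolve against.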
Public/Granted literature:
- US20220207065A1, PREPARING DOCUMENTS FOR COREFERENCE ANALYSIS, Public/Granted day: 2022-06-30