Invention Grant
- Patent Title: Ingesting documents using multiple ingestion pipelines
- Application No.: US14728050
- Application Date: 2015-06-02
- Publication No.: US10318591B2
- Publication Date: 2019-06-11
- Inventor: Pamela D. Andrejko, Andrew R. Freed, Cynthia M. Murch, Jan M. Nordland, Humberto R. Rivero
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Nicholas D. Bowman
- Main IPC: G06F17/00
- IPC: G06F17/00; G06F16/93; G06F17/24; G06F16/33

Abstract:
A primary ingestion pipeline configured for use in natural language processing includes annotators configured for annotating documents. The annotators and documents to be annotated are evaluated. Based on the evaluations, an ingestion risk score is generated for each document. Each ingestion risk score represents a likelihood that an associated document will not successfully be annotated by the annotators. Each ingestion risk score is compared to a set of risk criteria. Based on the comparisons, a determination is made that each document of a first set of documents satisfies the set of risk criteria. A further determination is made, based on the comparisons, that each document of a second set of documents does not satisfy the set of risk criteria. In response to these determinations, the first set of documents is entered into the primary ingestion pipeline and the second set of documents is provided special handling.
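
The routing flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `Document`, `ingestion_risk_score`, `satisfies_risk_criteria`, and `route_documents`, the two example evaluators, the use of the maximum as the combining rule, and the 0.5 threshold are all assumptions introduced here. Each evaluator stands in for an evaluation of a document against the annotators.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


# Hypothetical: an evaluator returns a risk contribution in [0, 1],
# where higher means annotation is more likely to fail.
Evaluator = Callable[[Document], float]


def ingestion_risk_score(doc: Document, evaluators: Iterable[Evaluator]) -> float:
    """Combine evaluator outputs into one score (assumed rule: take the maximum)."""
    return max((evaluate(doc) for evaluate in evaluators), default=0.0)


def satisfies_risk_criteria(score: float, max_risk: float = 0.5) -> bool:
    """Assumed risk criterion: a single upper bound on the score."""
    return score <= max_risk


def route_documents(
    docs: Iterable[Document],
    evaluators: List[Evaluator],
    max_risk: float = 0.5,
) -> Tuple[List[Document], List[Document]]:
    """Split documents into (primary pipeline, special handling)."""
    primary: List[Document] = []
    special: List[Document] = []
    for doc in docs:
        score = ingestion_risk_score(doc, evaluators)
        if satisfies_risk_criteria(score, max_risk):
            primary.append(doc)
        else:
            special.append(doc)
    return primary, special


# Illustrative evaluators only; real ones would inspect annotator requirements.
def length_risk(doc: Document) -> float:
    """Longer documents are assumed riskier, capped at 1.0."""
    return min(len(doc.text) / 10_000, 1.0)


def non_ascii_risk(doc: Document) -> float:
    """Fraction of non-ASCII characters, as a stand-in for encoding trouble."""
    if not doc.text:
        return 0.0
    return sum(ord(ch) > 127 for ch in doc.text) / len(doc.text)


if __name__ == "__main__":
    docs = [Document("a", "short text"), Document("b", "x" * 20_000)]
    primary, special = route_documents(docs, [length_risk, non_ascii_risk])
    print([d.doc_id for d in primary], [d.doc_id for d in special])
```

In this sketch the first returned list corresponds to the first set of documents (entered into the primary ingestion pipeline) and the second to the set given special handling; what special handling involves is left open here.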
Public/Granted literature
- US20160359894A1 INGESTING DOCUMENTS USING MULTIPLE INGESTION PIPELINES Public/Granted day: 2016-12-08