Forecastable supervised labels and corpus sets for training a natural-language processing system

Invention Grant

US10796241B2 Forecastable supervised labels and corpus sets for training a natural-language processing system (in force)

Patent Title: Forecastable supervised labels and corpus sets for training a natural-language processing system
Application No.: US14927766

Application Date: 2015-10-30
Publication No.: US10796241B2

Publication Date: 2020-10-06
Inventor: Aaron K. Baughman, Gary F. Diamanti, Mauro Marzorati
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Schmeiser, Olsen & Watts
Agent: Michael A. Petrocelli
Main IPC: G06N20/00
IPC: G06N20/00; G06N5/04; G06F40/30; G06F40/42

Abstract:

A method and associated systems for forecastable supervised labels and corpus sets for training a natural-language processing system. An NLP-training system asks an “oracle” expert to answer a predictive test question and, in response, receives from the oracle an answer, rationales for selecting that answer, and identifications of extrinsic natural-language sources of evidence that supports those rationales. The system retrieves updated versions of that evidence at a later time, and returns that updated evidence to the oracle. In response, the oracle returns an updated answer and rationales based on the updated evidence. The system then compares time-varying characteristics of the evidence in order to determine the relative contribution of each piece of evidence to the oracle's selections. Less relevant evidence is discarded and the remaining, optimized, evidence is forwarded to the NLP system to be used as training data.
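
The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how contributions are computed, so the `Evidence` fields, the `contribution` heuristic, and the 0.5 threshold are all hypothetical choices made for illustration. The sketch assumes that each evidence source carries a 0-to-1 measure of how much its text changed between the two rounds, and that the system knows whether the oracle cited the source in each round and whether the oracle's answer changed.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One extrinsic natural-language source (all fields are hypothetical)."""
    source: str         # identifier of the source document
    cited_before: bool  # oracle cited it in the first round of rationales
    cited_after: bool   # oracle cited it again after the evidence was updated
    change: float       # assumed 0..1 measure of how much the text changed


def contribution(ev: Evidence, answer_changed: bool) -> float:
    """Heuristic relevance score; an assumption, not the patent's formula."""
    if not (ev.cited_before or ev.cited_after):
        return 0.0  # never cited, so it cannot have driven the answer
    # If the answer moved, evidence that changed a lot gets credit;
    # if the answer held, evidence that stayed stable gets credit.
    alignment = ev.change if answer_changed else 1.0 - ev.change
    citation_weight = (ev.cited_before + ev.cited_after) / 2
    return alignment * citation_weight


def select_training_evidence(items, answer_changed, threshold=0.5):
    """Discard low-scoring evidence and return the sources worth keeping."""
    return [
        ev.source
        for ev in items
        if contribution(ev, answer_changed) >= threshold
    ]
```

In a fuller system the retained sources, paired with the oracle's answers, would be forwarded to the NLP system as training data. The scoring function is the part most likely to differ from any real implementation.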

Public/Granted literature

US20170124479A1 Forecastable supervised labels and corpus sets for training a natural-language processing system. Publication date: 2017-05-04

IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N20/00	Machine learning