Automated data and label creation for supervised machine learning regression testing

Invention Grant

US11295242B2 Automated data and label creation for supervised machine learning regression testing (Active)

Patent Title: Automated data and label creation for supervised machine learning regression testing
Application No.: US16682946

Application Date: 2019-11-13
Publication No.: US11295242B2

Publication Date: 2022-04-05
Inventor: Yuan-Chi Chang, Deepak Srinivas Turaga, Long Vu, Venkata Nagaraju Pavuluri, Saket Sathe, Rodrigue Ngueyep Tzoumpe
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Otterstedt, Wallace & Kammer, LLP
Agent: Anthony Curro
Main IPC: G06N20/10
IPC: G06N20/10; G06F17/18; G06K9/62; G06N3/02; G06N3/08; G06N3/04; G06N5/00; G06N20/20

Abstract:

Split an input dataset into a training dataset and a test dataset; the former includes a plurality of data examples, each represented as a feature vector and having an associated true label. Split the training dataset into a plurality of training data subsets; for each subset, train a corresponding machine learning model to obtain a plurality of such models, and apply the models to the test dataset to obtain a plurality of predicted labels and prediction scores. For each of the plurality of examples, compute an agreement metric based on the corresponding associated true label, the corresponding predicted labels, and the corresponding prediction scores. Based on the computed metric, select, for at least some of the true label values, appropriate ones of the data examples to be added to a regression set. Add the selected data examples from the test dataset to the regression set.
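The steps in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the nearest-centroid learner, the score normalization, the particular agreement formula (mean prediction score over models, counting a model only when its label matches the true label), and the "keep the top few per label" selection rule are all hypothetical choices, as are the names `train_centroid_model`, `predict`, `agreement`, and `build_regression_set`. The abstract does not specify any of them.

```python
import random
from collections import defaultdict

def train_centroid_model(examples):
    """Toy stand-in for a learner: one mean feature vector (centroid) per label."""
    sums, counts = {}, defaultdict(int)
    for x, y in examples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(model, x):
    """Return (predicted_label, score); the score lies in (0, 1]."""
    dists = {y: sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5
             for y, c in model.items()}
    label = min(dists, key=dists.get)
    weights = {y: 1.0 / (1.0 + d) for y, d in dists.items()}
    return label, weights[label] / sum(weights.values())

def agreement(true_label, predictions):
    """Assumed metric: mean score over models, counting only matching labels."""
    return sum(s for lbl, s in predictions if lbl == true_label) / len(predictions)

def build_regression_set(dataset, n_subsets=3, per_label=2,
                         test_fraction=0.3, seed=0):
    rng = random.Random(seed)
    data = dataset[:]
    rng.shuffle(data)

    # Step 1: split the input dataset into training and test datasets.
    n_test = max(1, int(len(data) * test_fraction))
    test, train = data[:n_test], data[n_test:]

    # Step 2: split the training data into subsets and train one model per subset.
    subsets = [train[i::n_subsets] for i in range(n_subsets)]
    models = [train_centroid_model(s) for s in subsets if s]

    # Step 3: apply every model to each test example, then compute agreement.
    scored = defaultdict(list)
    for x, y in test:
        preds = [predict(m, x) for m in models]
        scored[y].append((agreement(y, preds), x))

    # Step 4: per true label, keep the examples with the highest agreement
    # (an assumed selection rule) and add them to the regression set.
    regression_set = []
    for y, items in scored.items():
        items.sort(key=lambda t: t[0], reverse=True)
        regression_set.extend((x, y, a) for a, x in items[:per_label])
    return regression_set

# Usage on synthetic two-class data (hypothetical labels "a" and "b").
rng = random.Random(1)
toy = [([rng.gauss(0, 0.5), rng.gauss(0, 0.5)], "a") for _ in range(20)]
toy += [([rng.gauss(3, 0.5), rng.gauss(3, 0.5)], "b") for _ in range(20)]
regression_set = build_regression_set(toy)
for features, label, score in regression_set:
    print(label, [round(v, 2) for v in features], round(score, 3))
```

Selecting the highest-agreement examples yields cases that independently trained models classify consistently, which is one plausible reading of "appropriate" for a regression test; a selection that spans a range of agreement values would be an equally valid variation on this sketch.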

Public/Granted literature

US20210142222A1 AUTOMATED DATA AND LABEL CREATION FOR SUPERVISED MACHINE LEARNING REGRESSION TESTING Publication date: 2021-05-13

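IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computing arrangements based on specific computational models
G06N20/00	Machine learning
G06N20/10	. Using kernel methods, e.g. support vector machines [SVM]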