Using multiple trained models to reduce data labeling efforts

Invention Grant

US11714802B2 Using multiple trained models to reduce data labeling efforts (In force)

Patent Title: Using multiple trained models to reduce data labeling efforts
Application No.: US17221661

Application Date: 2021-04-02
Publication No.: US11714802B2

Publication Date: 2023-08-01
Inventor: Matthew Shreve, Francisco E. Torres, Raja Bala, Robert R. Price, Pei Li
Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
Applicant Address: US CA Palo Alto
Assignee: Palo Alto Research Center Incorporated
Current Assignee: Palo Alto Research Center Incorporated
Current Assignee Address: US CA Palo Alto
Agency: Womble Bond Dickinson (US) LLP
Main IPC: G06F16/00
IPC: G06F16/00; G06F16/23; G06N20/00

Abstract:

A method of labeling a dataset of input samples for a machine learning task includes selecting a plurality of pre-trained machine learning models that are related to a machine learning task. The method further includes processing a plurality of input data samples through each of the pre-trained models to generate a set of embeddings. The method further includes generating a plurality of clusterings from the set of embeddings. The method further includes analyzing, by a processing device, the plurality of clusterings to extract superclusters. The method further includes assigning pseudo-labels to the input samples based on analysis.
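
The steps in the abstract can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the "pre-trained models" are hypothetical stand-in functions, `leader_cluster` is a simple placeholder clustering routine, and a supercluster is assumed here to be a group of samples that fall in the same cluster under every model. The abstract itself does not define how superclusters are extracted.

```python
from collections import defaultdict
from math import dist


def leader_cluster(embeddings, radius):
    """Toy clustering (assumption): join the first leader within `radius`."""
    leaders, labels = [], []
    for e in embeddings:
        for idx, leader in enumerate(leaders):
            if dist(e, leader) <= radius:
                labels.append(idx)
                break
        else:
            # No leader is close enough, so this point starts a new cluster.
            leaders.append(e)
            labels.append(len(leaders) - 1)
    return labels


def extract_superclusters(clusterings, min_size=2):
    """Assumed rule: samples that share a cluster in every clustering."""
    groups = defaultdict(list)
    for i, signature in enumerate(zip(*clusterings)):
        groups[signature].append(i)
    return [members for members in groups.values() if len(members) >= min_size]


def pseudo_label(samples, models, radius=1.0, min_size=2):
    """Embed with each model, cluster each embedding set, label superclusters."""
    clusterings = [
        leader_cluster([model(s) for s in samples], radius) for model in models
    ]
    labels = [None] * len(samples)  # None means "left for manual labeling"
    for k, members in enumerate(extract_superclusters(clusterings, min_size)):
        for i in members:
            labels[i] = k
    return labels


# Hypothetical stand-ins for pre-trained models: each maps a sample to an embedding.
samples = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (0.1, 5.0)]
models = [
    lambda s: (s[0],),  # "model A" embeds using the first feature only
    lambda s: (s[1],),  # "model B" embeds using the second feature only
]
labels = pseudo_label(samples, models)
print(labels)
```

In this sketch the last sample is grouped differently by the two stand-in models, so it receives no pseudo-label and would be left for a human annotator, while the samples the models agree on are labeled automatically.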

Published application:

US20220318229A1 USING MULTIPLE TRAINED MODELS TO REDUCE DATA LABELING EFFORTS, publication date: 2022-10-06

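IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models are classified in G06N)
G06F16/00	Information retrieval; database structures; file system structures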