Systems and methods for machine learning-based multi-intent segmentation and classification

Invention Grant

US10824818B2 Systems and methods for machine learning-based multi-intent segmentation and classification (in force)

Patent Title: Systems and methods for machine learning-based multi-intent segmentation and classification
Application No.: US16854834

Application Date: 2020-04-21
Publication No.: US10824818B2

Publication Date: 2020-11-03
Inventor: Joseph Peper, Parker Hill, Kevin Leach, Sean Stapleton, Jonathan K. Kummerfeld, Johann Hauswald, Michael Laurenzano, Lingjia Tang, Jason Mars
Applicant: Clinc, Inc.
Applicant Address: US MI Ann Arbor
Assignee: Clinc, Inc.
Current Assignee: Clinc, Inc.
Current Assignee Address: US MI Ann Arbor
Agent: Padowithz Alce
Main IPC: G06F40/30
IPC: G06F40/30; G06N7/00; G06N20/00; G06F40/284; G10L15/18; G10L15/06; G10L15/16

Abstract:

Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utterance from the second corpus of utterances and the second in-domain utterance from the first corpus of utterances.
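The synthesis steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `synthesize_multi_intent`, the parameters `p_out_of_domain` and `p_conjunction`, their default values, and the choice to join segments with a single space are all hypothetical. The abstract says only that the selections are probabilistic and does not give the probabilities. The sketch also assumes the second in-domain utterance should differ from the first where possible.

```python
import random


def synthesize_multi_intent(in_domain, out_of_domain, conjunctions,
                            p_out_of_domain=0.3, p_conjunction=0.5, rng=None):
    """Build one synthetic multi-intent utterance from three corpora.

    The probabilities are illustrative assumptions; the abstract does not
    state their values.
    """
    if rng is None:
        rng = random.Random()

    # Step 1: select a first in-domain utterance.
    first = rng.choice(in_domain)

    # Step 2: probabilistically select either an out-of-domain utterance
    # or a second in-domain utterance.
    if rng.random() < p_out_of_domain:
        second = rng.choice(out_of_domain)
    else:
        # Assumption: prefer a second utterance distinct from the first.
        candidates = [u for u in in_domain if u != first] or list(in_domain)
        second = rng.choice(candidates)

    # Step 3: probabilistically select, or skip, a conjunction term.
    parts = [first]
    if rng.random() < p_conjunction:
        parts.append(rng.choice(conjunctions))

    # Step 4: append the second utterance to the first.
    parts.append(second)
    return " ".join(parts)


def build_corpus(in_domain, out_of_domain, conjunctions, size, seed=None):
    """Form a multi-intent training corpus of `size` synthetic utterances."""
    rng = random.Random(seed)
    return [
        synthesize_multi_intent(in_domain, out_of_domain, conjunctions, rng=rng)
        for _ in range(size)
    ]
```

Calling `build_corpus(in_domain, out_of_domain, conjunctions, size=1000, seed=0)` with three lists of strings would return 1000 synthetic utterances. Passing a seed makes the sampled corpus reproducible.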

Public/Granted literature

US20200257857A1 SYSTEMS AND METHODS FOR MACHINE LEARNING-BASED MULTI-INTENT SEGMENTATION AND CLASSIFICATION Public/Granted day: 2020-08-13

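IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F40/00	Handling natural language data (speech analysis or synthesis, speech recognition: see G10L)
G06F40/30	. Semantic analysis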