Invention Grant
- Patent Title: Systems and methods for intelligently curating machine learning training data and improving machine learning model performance
- Application No.: US16379978
- Application Date: 2019-04-10
- Publication No.: US10679100B2
- Publication Date: 2020-06-09
- Inventor: Yiping Kang, Yunqi Zhang, Jonathan K. Kummerfeld, Parker Hill, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars
- Applicant: Clinc, Inc.
- Applicant Address: US MI Ann Arbor
- Assignee: Clinc, Inc.
- Current Assignee: Clinc, Inc.
- Current Assignee Address: US MI Ann Arbor
- Agent: Padowithz Alce
- Main IPC: G06N5/04
- IPC: G06N5/04; G06K9/62; G06N20/00; G06F16/332

Abstract:
Systems and methods of intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system include: constructing a machine learning test corpus that comprises a plurality of historical queries and commands sampled from production logs of a deployed dialogue system; configuring training data sourcing parameters to source a corpus of raw machine learning training data from remote sources of machine learning training data; calculating efficacy metrics of the corpus of raw machine learning training data, wherein calculating the efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpus; and using the corpus of raw machine learning training data to train at least one machine learning classifier if the calculated coverage metric value satisfies a minimum coverage metric threshold.
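
The abstract names a coverage metric, a diversity metric, and a minimum coverage threshold, but does not define how they are computed. The sketch below is only an illustration under stated assumptions, not the patented method: coverage is assumed to be the share of test utterances that have at least one training utterance with token-set Jaccard similarity at or above a cutoff, and diversity is assumed to be the mean pairwise Jaccard distance within the training data. All function names, the `min_sim` cutoff, and the `min_coverage` default are hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def tokens(utterance: str) -> frozenset[str]:
    """Lowercased whitespace tokens of an utterance."""
    return frozenset(utterance.lower().split())


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def coverage(train: list[str], test: list[str], min_sim: float = 0.5) -> float:
    """ASSUMED definition: share of test utterances that have at least one
    training utterance with Jaccard similarity >= min_sim."""
    if not test:
        return 0.0
    train_sets = [tokens(u) for u in train]
    hits = sum(
        1
        for u in test
        if any(jaccard(tokens(u), t) >= min_sim for t in train_sets)
    )
    return hits / len(test)


def diversity(train: list[str]) -> float:
    """ASSUMED definition: mean pairwise Jaccard distance within the
    training data (0.0 when there are fewer than two utterances)."""
    pairs = list(combinations([tokens(u) for u in train], 2))
    if not pairs:
        return 0.0
    return sum(1.0 - jaccard(a, b) for a, b in pairs) / len(pairs)


def should_train(train: list[str], test: list[str],
                 min_coverage: float = 0.8) -> bool:
    """Gate from the abstract: train only if the coverage metric value
    satisfies the minimum coverage threshold (threshold is hypothetical)."""
    return coverage(train, test) >= min_coverage


if __name__ == "__main__":
    # Toy data standing in for sourced training data and production-log queries.
    train = ["check my balance", "transfer money to savings"]
    test = ["check balance", "transfer money", "what is the weather"]
    print(coverage(train, test), diversity(train), should_train(train, test))
```

In this toy run, two of the three test queries find a sufficiently similar training utterance, so the data would fail a 0.8 coverage gate and more training data would have to be sourced before training.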