Invention Grant
- Patent Title: Synthetic data generation for training of natural language understanding models
- Application No.: US17021892
- Application Date: 2020-09-15
- Publication No.: US11508360B2
- Publication Date: 2022-11-22
- Inventor: Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Nanshan Zeng, Jianfeng Gao
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Rainier Patents, P.S.
- Main IPC: G10L15/18
- IPC: G10L15/18; G10L15/22; G10L15/08

Abstract:
This document relates to machine learning. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining a task-adapted generative model that has been tuned using one or more task-specific seed examples. The method or technique can also include inputting dialog acts into the task-adapted generative model and obtaining synthetic utterances that are output by the task-adapted generative model. The method or technique can also include populating a synthetic training corpus with synthetic training examples that include the synthetic utterances. The synthetic training corpus may be suitable for training a natural language understanding model.
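
The flow described in the abstract (tune a generative model on seed examples, feed it dialog acts, collect the synthetic utterances into a training corpus) can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: `DialogAct`, `ToyTaskAdaptedGenerator`, and `build_synthetic_corpus` are hypothetical names, and the "model" is a simple template filler standing in for a real task-adapted generative model, which the abstract does not specify.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogAct:
    """Hypothetical dialog act: an intent plus (slot_name, slot_value) pairs."""
    intent: str
    slots: tuple


class ToyTaskAdaptedGenerator:
    """Assumed stand-in for a task-adapted generative model.

    "Tuning" here only memorizes delexicalized templates from the seed
    examples; a real system would fine-tune a pretrained language model.
    """

    def __init__(self, seed_examples, rng_seed=0):
        self.templates = {}
        for act, utterance in seed_examples:
            template = utterance
            # Replace each slot value with a named placeholder, e.g. {city}.
            for name, value in act.slots:
                template = template.replace(value, "{" + name + "}")
            self.templates.setdefault(act.intent, []).append(template)
        self.rng = random.Random(rng_seed)

    def generate(self, act):
        # Pick a template learned for this intent and fill in the new slot values.
        template = self.rng.choice(self.templates[act.intent])
        return template.format(**dict(act.slots))


def build_synthetic_corpus(model, dialog_acts):
    """Pair each synthetic utterance with the labels of the dialog act that produced it."""
    return [
        {"utterance": model.generate(act), "intent": act.intent, "slots": dict(act.slots)}
        for act in dialog_acts
    ]


# One task-specific seed example: a dialog act and a matching utterance.
seed_examples = [
    (
        DialogAct("book_restaurant", (("city", "Seattle"), ("cuisine", "Thai"))),
        "Find me a Thai restaurant in Seattle",
    )
]
model = ToyTaskAdaptedGenerator(seed_examples)

# New dialog acts go in; labeled synthetic training examples come out.
new_acts = [DialogAct("book_restaurant", (("city", "Redmond"), ("cuisine", "Mexican")))]
corpus = build_synthetic_corpus(model, new_acts)
print(corpus[0]["utterance"])
```

Because every synthetic utterance is generated from a known dialog act, each example arrives already labeled with its intent and slots, which is what would make such a corpus usable for training a natural language understanding model.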
Public/Granted literature
- US20220084510A1 SYNTHETIC DATA GENERATION FOR TRAINING OF NATURAL LANGUAGE UNDERSTANDING MODELS, Public/Granted day: 2022-03-17