Invention Grant
- Patent Title: Generating model training data from a domain specification
- Application No.: US17505531
- Application Date: 2021-10-19
- Publication No.: US12159115B2
- Publication Date: 2024-12-03
- Inventor: Zeqi Lin, Yu Hu, Haiyuan Cao, Yi Liu, Jian-Guang Lou, Kuralmani Elango, PalaniRaj Kaliyaperumal, Weizhu Chen, Kunal Mukerjee
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Barta Jones, PLLC
- Main IPC: G06F40/35
- IPC: G06F40/35; G06F40/186; G06F40/211; G06N20/00

Abstract:
Examples described herein generate training data for machine learning (ML) for natural language (NL) processing (such as semantic parsing for translating NL). A formula tree is generated based on sampling both a formula grammar and NL templates. Using the formula tree, an ML training data instance pair is generated comprising a formula example and an NL example. A context example may also be used during instantiation of the formula tree. An ML model is trained with training data including the ML training data instance pair, and ML output is generated from NL input. The ML output includes, for example, a machine-interpretable formula, a database querying language command, or a general programming language instruction. Some examples support context-free grammar, probabilistic context-free grammar, and/or non-context-free production rules.
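
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy grammar, the rule layout (one formula template paired with several NL templates), the `COL` placeholder filled from a context example, and all function names are assumptions made up for illustration. It uses uniform sampling over a small context-free grammar; a probabilistic variant would attach weights to the rules.

```python
import random

# Hypothetical toy domain specification (an assumption, not from the patent).
# Each production rule pairs one formula template with several NL templates;
# {0}, {1}, ... stand for the rule's child symbols.
GRAMMAR = {
    "EXPR": [
        {"children": ["COL"], "formula": "SUM({0})",
         "nl": ["the sum of {0}", "total {0}"]},
        {"children": ["COL"], "formula": "AVERAGE({0})",
         "nl": ["the average of {0}", "mean {0}"]},
        {"children": ["EXPR", "EXPR"], "formula": "({0} - {1})",
         "nl": ["{0} minus {1}", "the difference between {0} and {1}"]},
    ],
}


def sample_tree(symbol, rng, depth=0, max_depth=2):
    """Sample a formula tree: pick a grammar rule and one of its NL templates."""
    if symbol == "COL":
        # Placeholder leaf, filled from the context example at instantiation.
        return {"symbol": "COL"}
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        # At the depth limit, keep only non-recursive rules so sampling ends.
        rules = [r for r in rules if "EXPR" not in r["children"]]
    rule = rng.choice(rules)
    return {
        "symbol": symbol,
        "formula": rule["formula"],
        "nl": rng.choice(rule["nl"]),
        "children": [sample_tree(c, rng, depth + 1, max_depth)
                     for c in rule["children"]],
    }


def instantiate(tree, context, rng):
    """Turn a sampled tree into a (formula example, NL example) pair."""
    if tree["symbol"] == "COL":
        column = rng.choice(context["columns"])
        return column, column
    parts = [instantiate(child, context, rng) for child in tree["children"]]
    formula = tree["formula"].format(*(p[0] for p in parts))
    nl = tree["nl"].format(*(p[1] for p in parts))
    return formula, nl


def generate_pair(context, seed=None):
    """Generate one ML training data instance pair for the given context."""
    rng = random.Random(seed)
    tree = sample_tree("EXPR", rng)
    return instantiate(tree, context, rng)


if __name__ == "__main__":
    context_example = {"columns": ["Price", "Quantity"]}  # hypothetical context
    for seed in range(3):
        print(generate_pair(context_example, seed=seed))
```

Because the formula and the NL text are rendered from the same sampled tree, each pair is aligned by construction, which is the property that makes such pairs usable as supervised training data for a semantic parser.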
Public/Granted literature
- US20230119613A1 GENERATING MODEL TRAINING DATA FROM A DOMAIN SPECIFICATION Public/Granted day: 2023-04-20