Invention Grant
- Patent Title: ML using n-gram induced input representation
- Application No.: US17229140
- Application Date: 2021-04-13
- Publication No.: US11836438B2
- Publication Date: 2023-12-05
- Inventor: Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Schwegman Lundberg & Woessner, P.A.
- Main IPC: G06F40/126
- IPC: G06F40/126; G06N3/08; G06F40/151

Abstract:
Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string dependent and global string dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string-dependent embedding and the global string-dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-gram induced embedding of each token to generate the masked word prediction.
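
The embedding steps named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the whitespace tokenizer, the pseudo-random token vectors standing in for a learned embedding table, the sliding-window average used as the local (n-gram) embedding, the unparameterized dot-product attention used as the global embedding, and the element-wise sum used to combine them. The final step, executing a pretrained MLM on the result, is omitted.

```python
import math
import random


def tokenize(text):
    # Hypothetical whitespace tokenizer (assumption; the abstract names no tokenizer).
    return text.lower().split()


def token_vector(token, dim=4):
    # Deterministic pseudo-random vector per token, a stand-in for a learned
    # embedding table (assumption for illustration only).
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def local_embeddings(vectors, n=3):
    # Local string-dependent embedding: average each token's vector with its
    # neighbors inside an n-token window (a simple n-gram style pooling).
    half = n // 2
    out = []
    for i in range(len(vectors)):
        window = vectors[max(0, i - half): i + half + 1]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def global_embeddings(vectors):
    # Global string-dependent embedding: each token attends to every token in
    # the string using softmax-normalized dot-product scores (no learned weights).
    out = []
    for query in vectors:
        scores = [sum(a * b for a, b in zip(query, key)) for key in vectors]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[d] for w, v in zip(weights, vectors)) / total
            for d in range(len(query))
        ])
    return out


def ngram_induced_embeddings(tokens, n=3, dim=4):
    # Combine local and global embeddings by element-wise sum (assumed combiner).
    vectors = [token_vector(t, dim) for t in tokens]
    local = local_embeddings(vectors, n)
    global_ = global_embeddings(vectors)
    return [[l + g for l, g in zip(lv, gv)] for lv, gv in zip(local, global_)]


if __name__ == "__main__":
    tokens = tokenize("the cat sat on the mat")
    for token, emb in zip(tokens, ngram_induced_embeddings(tokens)):
        print(token, [round(x, 3) for x in emb])
```

In this sketch the two occurrences of "the" start from the same base vector but end with different combined embeddings, because their local windows differ. That is the sense in which the result depends on the surrounding string.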
Public/Granted literature
- US20210232753A1 ML USING N-GRAM INDUCED INPUT REPRESENTATION Public/Granted day: 2021-07-29