Training and/or using a language selection model for automatically determining language for speech recognition of spoken utterance

Invention Grant

US11646011B2 Training and/or using a language selection model for automatically determining language for speech recognition of spoken utterance (Active)

Patent Title: Training and/or using a language selection model for automatically determining language for speech recognition of spoken utterance
Application No.: US17846287

Application Date: 2022-06-22
Publication No.: US11646011B2

Publication Date: 2023-05-09
Inventor: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: GOOGLE LLC
Current Assignee: GOOGLE LLC
Current Assignee Address: US CA Mountain View
Agency: Gray Ice Higdon
Main IPC: G10L15/00
IPC: G10L15/00

Abstract:

Methods and systems for training and/or using a language selection model for use in determining a particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language selected based on the generated probabilities. Speech recognition results for the particular language can be utilized responsive to selecting the particular language of the spoken utterance. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses. Training the language selection model utilizing the tuple losses can result in more efficient training and/or can result in a more accurate and/or robust model, thereby mitigating erroneous language selections for spoken utterances.
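
The selection step and the contrast between a cross-entropy loss and a tuple loss can be illustrated with a minimal sketch. The abstract does not define the tuple loss, so the pairwise form below (a softmax restricted to the correct language and one other language, averaged over all such pairs) is an assumption chosen for illustration, as is the use of a simple argmax for "selected based on the generated probabilities". All function names and language codes are hypothetical and do not come from the patent.

```python
import math


def softmax(logits):
    """Convert raw model scores into probabilities over N languages."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_language(logits, languages):
    """Pick the language with the highest predicted probability.

    Assumption: plain argmax; the abstract only says the language is
    selected "based on" the probabilities.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return languages[best], probs


def cross_entropy_loss(logits, target):
    """Traditional loss: -log of the full-softmax probability of the target."""
    return -math.log(softmax(logits)[target])


def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_tuple_loss(logits, target):
    """Assumed tuple loss over pairs (target, k) for every k != target.

    For each pair, the loss is -log(e^{z_t} / (e^{z_t} + e^{z_k})),
    which equals softplus(z_k - z_t). The result is the mean over pairs.
    """
    others = [k for k in range(len(logits)) if k != target]
    total = sum(_softplus(logits[k] - logits[target]) for k in others)
    return total / len(others)


if __name__ == "__main__":
    languages = ["en-US", "es-ES", "de-DE"]  # hypothetical candidate set
    logits = [2.0, 0.5, -1.0]                # hypothetical model outputs
    chosen, probs = select_language(logits, languages)
    print(chosen, [round(p, 3) for p in probs])
    print(cross_entropy_loss(logits, 0), pairwise_tuple_loss(logits, 0))
```

Under this assumed form, the pairwise loss compares the correct language against each competitor separately, so it coincides with cross-entropy when only two languages are candidates and differs when there are more.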

Public/Granted literature

US20220328035A1 TRAINING AND/OR USING A LANGUAGE SELECTION MODEL FOR AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE Public/Granted day: 2022-10-13

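IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)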