Multilingual re-scoring models for automatic speech recognition

Invention Grant

US12254875B2 Multilingual re-scoring models for automatic speech recognition (Active)

Patent Title: Multilingual re-scoring models for automatic speech recognition
Application No.: US18589220

Application Date: 2024-02-27
Publication No.: US12254875B2

Publication Date: 2025-03-18
Inventor: Neeraj Gaur, Tongzhou Chen, Ehsan Variani, Bhuvana Ramabhadran, Parisa Haghani, Pedro J. Moreno Mengibar
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: Honigman LLP
Agent: Brett A. Krueger; Grant Griffith
Main IPC: G10L15/197
IPC: G10L15/197; G10L15/00; G10L15/16; G10L15/22

Abstract:

A method includes receiving a sequence of acoustic frames extracted from audio data corresponding to an utterance. During a first pass, the method includes processing the sequence of acoustic frames to generate N candidate hypotheses for the utterance. During a second pass, and for each candidate hypothesis, the method includes: generating a respective un-normalized likelihood score; generating a respective external language model score; generating a standalone score that models prior statistics of the corresponding candidate hypothesis; and generating a respective overall score for the candidate hypothesis based on the un-normalized likelihood score, the external language model score, and the standalone score. The method also includes selecting the candidate hypothesis having the highest respective overall score from among the N candidate hypotheses as a final transcription of the utterance.
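
The second-pass selection described in the abstract can be illustrated with a minimal sketch. The abstract states only that the overall score is "based on" the three component scores; the weighted sum used here, the weight names `lm_weight` and `standalone_weight`, and the `Hypothesis` class are assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One first-pass candidate with its three second-pass scores (hypothetical container)."""
    text: str
    likelihood_score: float   # un-normalized likelihood score
    lm_score: float           # external language model score
    standalone_score: float   # score modeling prior statistics of the hypothesis


def overall_score(h: Hypothesis,
                  lm_weight: float = 0.5,
                  standalone_weight: float = 0.5) -> float:
    # ASSUMPTION: a weighted sum. The abstract does not specify how the
    # three scores are combined, nor the sign or size of any weights.
    return (h.likelihood_score
            + lm_weight * h.lm_score
            + standalone_weight * h.standalone_score)


def select_transcription(hypotheses: List[Hypothesis],
                         lm_weight: float = 0.5,
                         standalone_weight: float = 0.5) -> Hypothesis:
    """Return the candidate with the highest overall score among the N candidates."""
    if not hypotheses:
        raise ValueError("at least one candidate hypothesis is required")
    return max(hypotheses,
               key=lambda h: overall_score(h, lm_weight, standalone_weight))


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    n_best = [
        Hypothesis("recognize speech", -4.0, -6.0, -3.0),
        Hypothesis("wreck a nice beach", -3.8, -12.0, -3.5),
    ]
    print(select_transcription(n_best).text)
```

In this sketch the weights would be tuned on held-out data; a negative `standalone_weight` could be passed if the standalone score is meant to be subtracted rather than added.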

Public/Granted literature

US20240203409A1 Multilingual Re-Scoring Models for Automatic Speech Recognition Public/Granted day: 2024-06-20

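IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	.Speech classification or search
G10L15/18	..using natural language modelling
G10L15/183	...using context dependencies, e.g. language models
G10L15/19	....Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
G10L15/197	.....Probabilistic grammars, e.g. word n-grams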