Method and system for generating synthetic multi-conditioned data sets for robust automatic speech recognition

Invention Grant

US11335329B2 Method and system for generating synthetic multi-conditioned data sets for robust automatic speech recognition (In force)

Patent Title: Method and system for generating synthetic multi-conditioned data sets for robust automatic speech recognition
Application No.: US16827863

Application Date: 2020-03-24
Publication No.: US11335329B2

Publication Date: 2022-05-17
Inventor: Meetkumar Hemakshu Soni, Sonal Joshi, Ashish Panda
Applicant: Tata Consultancy Services Limited
Applicant Address: IN Mumbai
Assignee: Tata Consultancy Services Limited
Current Assignee: Tata Consultancy Services Limited
Current Assignee Address: IN Mumbai
Agency: Finnegan, Henderson, Farabow, Garrett & Dunner, LLP
Priority: IN201921034591 20190828
Main IPC: G10L15/06
IPC: G10L15/06; G10L15/04; G10L21/0208; G06K9/62; G06N20/00

Abstract:

Robustness of Automatic Speech Recognition (ASR) against real-world noise and channel distortion is critical to its performance. Embodiments herein provide a method and system for generating synthetic multi-conditioned data sets, covering additive noise and channel distortion, for training multi-conditioned acoustic models for robust ASR. The method provides a generative noise model that generates a plurality of types of noise signals: additive noise based on a weighted linear combination of a plurality of noise basis signals, and channel distortion based on estimated channel responses. The generative noise model is a parametric model in which the selection of basis functions, the number of basis functions to be combined linearly, and the weights applied to the combination are all tunable, thereby enabling generation of a wide variety of noise signals. Further, the noise signals are added to a set of training speech utterances under a set of constraints, yielding multi-conditioned data sets that imitate real-world effects.
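The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the use of random signals as noise bases, the modelling of channel distortion as convolution with an impulse response, and the use of a target signal-to-noise ratio as the mixing constraint are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def make_noise(bases, weights):
    """Weighted linear combination of the selected noise basis signals."""
    bases = np.asarray(bases, dtype=float)      # shape: (num_bases, num_samples)
    weights = np.asarray(weights, dtype=float)  # shape: (num_bases,)
    return weights @ bases

def apply_channel(speech, channel_ir):
    """Assumed channel model: convolve with an impulse response, keep original length."""
    return np.convolve(speech, channel_ir)[: len(speech)]

def add_noise_at_snr(speech, noise, snr_db):
    """Assumed constraint: scale the noise so the mixture has the requested SNR in dB."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

# Hypothetical stand-ins: a sine tone for speech, white noise for the basis signals.
rng = np.random.default_rng(0)
n = 16000
t = np.arange(n) / 16000.0
speech = np.sin(2 * np.pi * 220.0 * t)
bases = rng.standard_normal((4, n))

# Tunable parameters: which bases, how many, and their weights.
selected = [0, 2, 3]
weights = np.array([0.5, 0.3, 0.2])
noise = make_noise(bases[selected], weights)

channel_ir = np.array([1.0, 0.4, 0.1])  # hypothetical estimated channel response
distorted = apply_channel(speech, channel_ir)
noisy = add_noise_at_snr(distorted, noise, snr_db=10.0)
```

Varying `selected`, `weights`, `channel_ir`, and `snr_db` across utterances would produce the range of conditions that a multi-conditioned training set is meant to cover.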

Public/Granted literature

US20210065681A1 METHOD AND SYSTEM FOR GENERATING SYNTHETIC MULTI-CONDITIONED DATA SETS FOR ROBUST AUTOMATIC SPEECH RECOGNITION Public/Granted day: 2021-03-04

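IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	. Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)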