Invention Grant
- Patent Title: Method and system for generating synthetic multi-conditioned data sets for robust automatic speech recognition
- Application No.: US16827863
- Application Date: 2020-03-24
- Publication No.: US11335329B2
- Publication Date: 2022-05-17
- Inventor: Meetkumar Hemakshu Soni, Sonal Joshi, Ashish Panda
- Applicant: Tata Consultancy Services Limited
- Applicant Address: IN Mumbai
- Assignee: Tata Consultancy Services Limited
- Current Assignee: Tata Consultancy Services Limited
- Current Assignee Address: IN Mumbai
- Agency: Finnegan, Henderson, Farabow, Garrett & Dunner, LLP
- Priority: IN201921034591 20190828
- Main IPC: G10L15/06
- IPC: G10L15/06 ; G10L15/04 ; G10L21/0208 ; G06K9/62 ; G06N20/00

Abstract:
Robustness of Automatic Speech Recognition (ASR) against real-world noise and channel distortions is critical. Embodiments herein provide a method and system for generating synthetic multi-conditioned data sets for additive noise and channel distortion, for training multi-conditioned acoustic models for robust ASR. The method provides a generative noise model that generates a plurality of types of noise signals: additive noise based on a weighted linear combination of a plurality of noise basis signals, and channel distortion based on estimated channel responses. The generative noise model is a parametric model in which the selection of basis functions, the number of basis functions to be combined linearly, and the weights applied to the combination are tunable, thereby enabling generation of a wide variety of noise signals. Further, the noise signals are added to a set of training speech utterances under a set of constraints, providing multi-conditioned data sets that imitate real-world effects.
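
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`synth_noise`, `apply_channel`, `mix_at_snr`), the use of a target signal-to-noise ratio as the "constraint", and the order of operations (channel filtering first, then additive noise) are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def synth_noise(bases, indices, weights):
    """Weighted linear combination of selected noise basis signals.

    bases:   array of shape (num_bases, num_samples)
    indices: which basis signals to combine (tunable selection and count)
    weights: one weight per selected basis signal (tunable)
    """
    selected = np.asarray(bases)[list(indices)]   # shape (k, num_samples)
    return np.asarray(weights, dtype=float) @ selected


def apply_channel(speech, channel_response):
    """Simulate channel distortion by convolving with an impulse response.

    The output is truncated to the input length (an illustrative choice).
    """
    return np.convolve(speech, channel_response)[: len(speech)]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR.

    Using SNR as the constraint is an assumption of this sketch.
    """
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


# Hypothetical data: random basis signals and a sine tone standing in for speech.
rng = np.random.default_rng(0)
bases = rng.standard_normal((5, 16000))
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)

noise = synth_noise(bases, indices=[0, 2, 4], weights=[0.5, 0.3, 0.2])
distorted = apply_channel(speech, np.array([1.0, 0.5]))
noisy = mix_at_snr(distorted, noise, snr_db=10.0)
```

Varying `indices`, `weights`, the channel response, and `snr_db` per utterance would yield many differently conditioned copies of the same training set, which is the sense in which the model is described as tunable.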