Invention Grant
- Patent Title: End-to-end speech recognition with policy learning
- Application No.: US16562257
- Application Date: 2019-09-05
- Publication No.: US11056099B2
- Publication Date: 2021-07-06
- Inventor: Yingbo Zhou, Caiming Xiong
- Applicant: salesforce.com, inc.
- Applicant Address: US CA San Francisco
- Assignee: salesforce.com, inc.
- Current Assignee: salesforce.com, inc.
- Current Assignee Address: US CA San Francisco
- Agency: Haynes and Boone, LLP
- Main IPC: G10L15/06
- IPC: G10L15/06; G06N3/08; G10L15/14; G10L15/16; G06N3/04; G06N7/00; G10L25/51

Abstract:
The disclosed technology teaches a deep end-to-end speech recognition model, including using multi-objective learning criteria to train the model on training data comprising speech samples temporally labeled with ground truth transcriptions. The multi-objective learning criteria update the model parameters over one thousand to millions of backpropagation iterations by combining, at each iteration, (1) a maximum likelihood objective function that modifies the model parameters to maximize the probability of outputting a correct transcription and (2) a policy gradient function that modifies the model parameters to maximize a positive reward. The reward is defined based on a non-differentiable performance metric that penalizes incorrect transcriptions in accordance with their conformity to the corresponding ground truth transcriptions. Upon convergence after a final backpropagation iteration, the modified model parameters learned using the multi-objective learning criteria are persisted with the model, to be applied to further end-to-end speech recognition.
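As a rough illustration of how the two objectives in the abstract might be combined at a single iteration, the following minimal sketch adds a maximum likelihood term to a REINFORCE-style policy gradient term. It is not taken from the patent. The reward (one minus the normalized edit distance, clipped at zero), the function names, the `weight` parameter, and the use of scalar log-probabilities in place of a real network are all assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (a non-differentiable metric)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def reward(ref, hyp):
    """Assumed reward: 1 - normalized edit distance, clipped to stay non-negative."""
    if len(ref) == 0:
        return 1.0 if len(hyp) == 0 else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def combined_loss(logp_ref, logp_sample, ref, sample, weight=1.0):
    """Hypothetical multi-objective loss for one training example.

    logp_ref:    model log-probability of the ground truth transcription
    logp_sample: model log-probability of a transcription sampled from the model
    weight:      assumed mixing coefficient between the two terms
    """
    ml_loss = -logp_ref                           # maximum likelihood term
    pg_loss = -reward(ref, sample) * logp_sample  # policy gradient surrogate
    return ml_loss + weight * pg_loss


# Example: a sampled transcription with one wrong character out of three.
loss = combined_loss(logp_ref=-0.5, logp_sample=-1.0, ref="cat", sample="cut")
```

In a real training loop the reward would be treated as a constant and gradients would flow only through the log-probabilities, so minimizing the second term raises the probability of samples that score well on the metric.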
Public/Granted literature
- US20200005765A1: End-To-End Speech Recognition with Policy Learning (Public/Granted day: 2020-01-02)