Neural network generative modeling to transform speech utterances and augment training data

Invention Grant

US10937438B2 Neural network generative modeling to transform speech utterances and augment training data (In Force)

Patent Title: Neural network generative modeling to transform speech utterances and augment training data
Application No.: US15940639

Application Date: 2018-03-29
Publication No.: US10937438B2

Publication Date: 2021-03-02
Inventor: Praveen Narayanan, Lisa Scaria, Francois Charette, Ashley Elizabeth Micks, Ryan Burke
Applicant: Ford Global Technologies, LLC
Applicant Address: US MI Dearborn
Assignee: Ford Global Technologies, LLC
Current Assignee: Ford Global Technologies, LLC
Current Assignee Address: US MI Dearborn
Agency: Stevens Law Group
Agent: David R. Stevens
Main IPC: G10L21/02
IPC: G10L21/02; G10L15/16; G10L25/03; G10L15/06; G06F3/16; G06N5/04

Abstract:

Systems, methods, and devices for speech transformation and generating synthetic speech using deep generative models are disclosed. A method of the disclosure includes receiving input audio data comprising a plurality of iterations of a speech utterance from a plurality of speakers. The method includes generating an input spectrogram based on the input audio data and transmitting the input spectrogram to a neural network configured to generate an output spectrogram. The method includes receiving the output spectrogram from the neural network and, based on the output spectrogram, generating synthetic audio data comprising the speech utterance.
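
The data flow described in the abstract (audio in, input spectrogram, neural network, output spectrogram, synthetic audio out) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the frame and hop sizes, the Hann window, the reuse of the input phase for resynthesis, and in particular `toy_network`, which is an untrained random bottleneck standing in for whatever trained generative model the disclosure uses. Sine tones stand in for recorded utterances from different speakers.

```python
import numpy as np

FRAME, HOP = 256, 128  # assumed analysis parameters, not from the patent


def stft(signal, frame=FRAME, hop=HOP):
    """Complex short-time Fourier transform, shape (n_frames, frame // 2 + 1)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(signal) - frame) // hop
    frames = np.stack([signal[i * hop:i * hop + frame] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, frame=FRAME, hop=HOP):
    """Weighted overlap-add inverse of stft()."""
    window = np.hanning(frame)
    frames = np.fft.irfft(spec, n=frame, axis=1) * window
    length = hop * (len(spec) - 1) + frame
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, fr in enumerate(frames):
        out[i * hop:i * hop + frame] += fr
        norm[i * hop:i * hop + frame] += window ** 2
    # Guard against division by zero where the window is zero (signal edges).
    return out / np.maximum(norm, 1e-8)


def toy_network(magnitude, hidden=32, seed=0):
    """Hypothetical stand-in for the trained generative model: an untrained
    random bottleneck mapping a magnitude spectrogram to one of equal shape."""
    rng = np.random.default_rng(seed)
    n_bins = magnitude.shape[1]
    w_enc = rng.normal(scale=n_bins ** -0.5, size=(n_bins, hidden))
    w_dec = rng.normal(scale=hidden ** -0.5, size=(hidden, n_bins))
    return np.abs(np.tanh(magnitude @ w_enc) @ w_dec)


def synthesize(audio, network=toy_network):
    """Audio -> input spectrogram -> network -> output spectrogram -> audio."""
    spec = stft(audio)
    in_mag, phase = np.abs(spec), np.angle(spec)
    out_mag = network(in_mag)
    # Assumption: reuse the input phase to invert the output magnitudes.
    return istft(out_mag * np.exp(1j * phase))


# Three synthetic tones standing in for utterances from different speakers.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
utterances = [np.sin(2 * np.pi * f0 * t) for f0 in (110.0, 150.0, 220.0)]
synthetic = [synthesize(u) for u in utterances]
```

Passing `network=lambda m: m` to `synthesize` turns the pipeline into a plain analysis and resynthesis round trip, which is a convenient sanity check before substituting a real model.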

Published/Granted Literature

US20190304480A1 Neural Network Generative Modeling To Transform Speech Utterances And Augment Training Data, publication date: 2019-10-03

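IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of speech or voice signals to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify their quality or intelligibility (G10L19/00 takes precedence)
G10L21/02	. Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems, see H04B3/20; echo suppression in hands-free telephones, see H04M9/08)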