Deep neural network based audio processing method, device and storage medium

Invention Grant

US11270688B2 Deep neural network based audio processing method, device and storage medium (Active)

Patent Title: Deep neural network based audio processing method, device and storage medium
Application No.: US16930337

Application Date: 2020-07-16
Publication No.: US11270688B2

Publication Date: 2022-03-08
Inventor: Congxi Lu, Linkai Li, Hongcheng Sun, Xinke Liu
Applicant: EVOCO LABS CO., LTD.
Applicant Address: CN Shanghai
Assignee: EVOCO LABS CO., LTD.
Current Assignee: EVOCO LABS CO., LTD.
Current Assignee Address: CN Shanghai
Agency: Jun He Law Offices P.C.
Agent: Zhaohui Wang
Priority: CN201910843603.0 2019-09-06
Main IPC: G10L15/06
IPC: G10L15/06; G06F17/14; G06N3/04; G06N3/08; G10L15/16; G10L19/005; G10L19/083; G10L25/30; H04R25/00; G10L21/0208; G10L21/0364

Abstract:

A deep neural network based audio processing method is provided. The method includes: obtaining a deep neural network based speech extraction model; receiving an audio input object having a speech portion and a non-speech portion, wherein the audio input object includes one or more audio data frames each having a set of audio data samples sampled at a predetermined sampling interval and represented in time domain data format; obtaining a user audiogram and a set of user gain compensation coefficients associated with the user audiogram; and inputting the audio input object and the set of user gain compensation coefficients into the trained speech extraction model to obtain an audio output result represented in time domain data format outputted by the trained speech extraction model, wherein the non-speech portion of the audio input object is at least partially attenuated in or removed from the audio output result.
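The data flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: `split_into_frames`, `placeholder_model`, and `process_audio` are hypothetical names, the frame length and energy threshold are arbitrary, and the "model" is a simple energy gate standing in for the trained deep neural network. It also collapses the user gain compensation coefficients into a single mean gain, whereas the patent associates them with the user audiogram.

```python
from typing import Callable, List, Sequence

Frame = List[float]


def split_into_frames(samples: Sequence[float], frame_len: int) -> List[Frame]:
    """Split time-domain samples into fixed-length frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame.extend([0.0] * (frame_len - len(frame)))
        frames.append(frame)
    return frames


def placeholder_model(frame: Frame, gains: Sequence[float]) -> Frame:
    """Hypothetical stand-in for the trained speech extraction model.

    ASSUMPTION: this is NOT a neural network. Low-energy frames are treated
    as non-speech and silenced; other frames are scaled by the mean of the
    gain compensation coefficients.
    """
    energy = sum(x * x for x in frame) / len(frame)
    if energy < 1e-4:  # arbitrary threshold chosen for illustration
        return [0.0] * len(frame)
    mean_gain = sum(gains) / len(gains)
    return [x * mean_gain for x in frame]


def process_audio(
    samples: Sequence[float],
    gains: Sequence[float],
    model: Callable[[Frame, Sequence[float]], Frame] = placeholder_model,
    frame_len: int = 4,
) -> List[float]:
    """Feed each time-domain frame plus the gains to the model; return time-domain output."""
    output: List[float] = []
    for frame in split_into_frames(samples, frame_len):
        output.extend(model(frame, gains))
    return output[:len(samples)]  # drop the zero padding added to the last frame


if __name__ == "__main__":
    audio = [0.5, -0.5, 0.5, -0.5, 0.001, -0.001, 0.001, -0.001, 0.5, 0.5]
    print(process_audio(audio, gains=[2.0, 2.0]))
```

Both input and output stay in time domain data format, as the abstract states; a real system would replace `placeholder_model` with the trained network.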

Public/Granted literature

US20210074266A1 DEEP NEURAL NETWORK BASED AUDIO PROCESSING METHOD, DEVICE AND STORAGE MEDIUM Public/Granted day: 2021-03-11

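IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	.Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)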