Invention Publication
- Patent Title: TIME-VARYING AND NONLINEAR AUDIO PROCESSING USING DEEP NEURAL NETWORKS
- Application No.: US17924701
- Application Date: 2020-05-12
- Publication No.: US20230197043A1
- Publication Date: 2023-06-22
- Inventor: Marco Antonio MARTINEZ RAMIREZ, Joshua Daniel REISS, Emmanouil BENETOS
- Applicant: QUEEN MARY UNIVERSITY OF LONDON
- Applicant Address: GB London, Greater London
- Assignee: QUEEN MARY UNIVERSITY OF LONDON
- Current Assignee: QUEEN MARY UNIVERSITY OF LONDON
- Current Assignee Address: GB London, Greater London
- International Application: PCT/GB2020/051150 2020-05-12
- Date entered country: 2022-11-11
- Main IPC: G10H1/00
- IPC: G10H1/00 ; G10H1/16 ; G06N3/045 ; G06N3/0442 ; G06N3/0499 ; G06N3/08

Abstract:
A computer-implemented method of processing audio data, the method comprising: receiving input audio data (x) comprising a time-series of amplitude values; transforming the input audio data (x) into an input frequency band decomposition (X1) of the input audio data (x); transforming the input frequency band decomposition (X1) into a first latent representation (Z); processing the first latent representation (Z) by a first deep neural network to obtain a second latent representation (Ẑ, Ẑ1); transforming the second latent representation (Ẑ, Ẑ1) to obtain a discrete approximation (X̂3); element-wise multiplying the discrete approximation (X̂3) and a residual feature map (R, X̂5) to obtain a modified feature map, wherein the residual feature map (R, X̂5) is derived from the input frequency band decomposition (X1); processing a pre-shaped frequency band decomposition by a waveshaping unit to obtain a waveshaped frequency band decomposition (X̂1, X̂1.2), wherein the pre-shaped frequency band decomposition is derived from the input frequency band decomposition (X1), wherein the waveshaping unit comprises a second deep neural network; summing the waveshaped frequency band decomposition (X̂1, X̂1.2) and a modified frequency band decomposition (X̂2, X̂1.1) to obtain a summation output (X̂0), wherein the modified frequency band decomposition (X̂2, X̂1.1) is derived from the modified feature map; and transforming the summation output (X̂0) to obtain target audio data (ŷ).
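
The order of operations in the abstract can be traced with a minimal Python sketch. It illustrates the data flow only and is not the claimed implementation. Every function below is a hypothetical toy stand-in: the band decomposition is a set of scaled copies of the input, both "deep neural networks" are replaced by a plain tanh, and the gains, the function names, and the choices of what the residual and pre-shaped inputs are derived from are assumptions made for illustration.

```python
import math
from typing import List, Sequence

Bands = List[List[float]]  # indexed as [band][sample]


def band_decompose(x: Sequence[float], gains: Sequence[float]) -> Bands:
    """Toy stand-in for x -> X1: each 'band' is a scaled copy of x."""
    return [[g * s for s in x] for g in gains]


def encode(X1: Bands) -> Bands:
    """Toy stand-in for X1 -> Z: per-sample magnitude of each band."""
    return [[abs(v) for v in band] for band in X1]


def first_network(Z: Bands) -> Bands:
    """Toy stand-in for the first deep neural network: Z -> Z_hat."""
    return [[math.tanh(v) for v in band] for band in Z]


def decode(Z_hat: Bands) -> Bands:
    """Toy stand-in for Z_hat -> X3_hat (identity copy here)."""
    return [list(band) for band in Z_hat]


def waveshaper(X: Bands) -> Bands:
    """Toy stand-in for the waveshaping unit (second deep neural network)."""
    return [[math.tanh(v) for v in band] for band in X]


def multiply(A: Bands, B: Bands) -> Bands:
    """Element-wise product of two band decompositions."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def add(A: Bands, B: Bands) -> Bands:
    """Element-wise sum of two band decompositions."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def synthesize(X0_hat: Bands) -> List[float]:
    """Toy stand-in for X0_hat -> y_hat: sum the bands sample by sample."""
    return [sum(column) for column in zip(*X0_hat)]


def process(x: Sequence[float], gains: Sequence[float] = (0.5, 0.25)) -> List[float]:
    X1 = band_decompose(x, gains)         # input frequency band decomposition
    Z = encode(X1)                        # first latent representation
    Z_hat = first_network(Z)              # second latent representation
    X3_hat = decode(Z_hat)                # discrete approximation
    R = X1                                # residual feature map (assumption: X1 itself)
    modified_map = multiply(X3_hat, R)    # modified feature map
    X2_hat = modified_map                 # modified band decomposition (assumption: identity)
    pre_shaped = X2_hat                   # pre-shaped decomposition (assumption; derived from X1 via R)
    X1_hat = waveshaper(pre_shaped)       # waveshaped band decomposition
    X0_hat = add(X1_hat, X2_hat)          # summation output
    return synthesize(X0_hat)             # target audio data


y_hat = process([0.0, 0.5, -1.0])
print(len(y_hat))  # one output sample per input sample
```

In a trained system each stand-in would be a learned layer or network; the sketch only fixes the order of operations and the two places where signals are combined (the element-wise product and the sum).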
Public/Granted literature
- US12334043B2 Time-varying and nonlinear audio processing using deep neural networks Public/Granted day: 2025-06-17