Automatic audio captioning

Invention Grant

US10679643B2 Automatic audio captioning (Active)


Patent Title: Automatic audio captioning
Application No.: US15691546

Application Date: 2017-08-30
Publication No.: US10679643B2

Publication Date: 2020-06-09
Inventor: Gregory Frederick Diamos, Sudnya Diamos, Michael Allen Evans
Applicant: Gregory Frederick Diamos, Sudnya Diamos, Michael Allen Evans
Main IPC: G10L15/00
IPC: G10L15/00; G10L21/10; G10L15/24; G10L15/16; G10L15/02; G10L15/06; G10L19/00; G10L15/183; G10L25/30

Abstract:

A method, computer readable medium, and system are disclosed for audio captioning. A raw audio waveform including a non-speech sound is received and relevant features are extracted from the raw audio waveform using a recurrent neural network (RNN) acoustic model. A discrete sequence of characters represented in a natural language is generated based on the relevant features, where the discrete sequence of characters comprises a caption that describes the non-speech sound.
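
The pipeline in the abstract (raw waveform, RNN feature extraction, character sequence) can be illustrated with a minimal sketch. This is not the patented implementation: the framing step, the plain tanh recurrence, the greedy CTC-style decoding with a blank symbol, and every name and size here (`VOCAB`, `frame_waveform`, `rnn_features`, `greedy_decode`, `caption`, `init_params`, frame length 160, hidden size 32) are assumptions chosen for illustration. The weights are random and untrained, so the resulting caption is meaningless text.

```python
import numpy as np

# Hypothetical character vocabulary; index 0 is a CTC-style blank (an assumption).
VOCAB = ["_"] + list("abcdefghijklmnopqrstuvwxyz ")


def frame_waveform(waveform, frame_len=160, hop=80):
    """Slice a 1-D raw waveform into overlapping frames, shape (T, frame_len)."""
    n_frames = 1 + (len(waveform) - frame_len) // hop
    return np.stack([waveform[i * hop: i * hop + frame_len] for i in range(n_frames)])


def rnn_features(frames, w_in, w_rec, bias):
    """Run a plain tanh RNN over the frames; return one hidden state per frame."""
    hidden = np.zeros(w_rec.shape[0])
    states = []
    for frame in frames:
        hidden = np.tanh(frame @ w_in + hidden @ w_rec + bias)
        states.append(hidden)
    return np.stack(states)


def greedy_decode(logits):
    """Take the best symbol per frame, merge consecutive repeats, drop blanks."""
    chars = []
    prev = -1
    for idx in logits.argmax(axis=1):
        if idx != prev and idx != 0:
            chars.append(VOCAB[idx])
        prev = idx
    return "".join(chars)


def init_params(rng, frame_len=160, hidden=32):
    """Random, untrained weights (illustrative only)."""
    return {
        "w_in": rng.normal(scale=0.1, size=(frame_len, hidden)),
        "w_rec": rng.normal(scale=0.1, size=(hidden, hidden)),
        "bias": np.zeros(hidden),
        "w_out": rng.normal(scale=0.1, size=(hidden, len(VOCAB))),
    }


def caption(waveform, params):
    """Waveform -> frames -> RNN features -> per-frame character logits -> text."""
    frames = frame_waveform(waveform)
    feats = rnn_features(frames, params["w_in"], params["w_rec"], params["bias"])
    logits = feats @ params["w_out"]
    return greedy_decode(logits)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.normal(size=1600)  # stand-in for a raw audio waveform
    print(repr(caption(audio, init_params(rng))))
```

In a trained system of this general shape, the output layer would be learned so that the decoded characters form a caption describing the non-speech sound; the decoder shown here is only one common way to turn per-frame scores into a discrete character sequence.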

Published/Granted documents

US20180061439A1 AUTOMATIC AUDIO CAPTIONING Publication date: 2018-03-01


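IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)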