Identifying keyword occurrences in audio data

Invention Grant

US08423363B2 Identifying keyword occurrences in audio data (in force)

Patent Title: Identifying keyword occurrences in audio data
Application No.: US12686892

Application Date: 2010-01-13
Publication No.: US08423363B2

Publication Date: 2013-04-16
Inventor: Vishwa Nath Gupta, Gilles Boulianne
Applicant: Vishwa Nath Gupta, Gilles Boulianne
Applicant Address: CA Montréal, Québec
Assignee: CRIM (Centre de Recherche Informatique de Montréal)
Current Assignee: CRIM (Centre de Recherche Informatique de Montréal)
Current Assignee Address: CA Montréal, Québec
Agency: Wilmer Cutler Pickering Hale and Dorr LLP
Main IPC: G10L15/28
IPC: G10L15/28; G10L15/00; G10L15/18

Abstract:

Occurrences of one or more keywords in audio data are identified using a speech recognizer employing a language model to derive a transcript of the keywords. The transcript is converted into a phoneme sequence. The phonemes of the phoneme sequence are mapped to the audio data to derive a time-aligned phoneme sequence that is searched for occurrences of keyword phoneme sequences corresponding to the phonemes of the keywords. Searching includes computing a confusion matrix. The language model used by the speech recognizer is adapted to the keywords by increasing the likelihoods of the keywords in the language model. For each potential keyword occurrence detected, a corresponding subset of the audio data may be played back to an operator to confirm whether the potential occurrence corresponds to an actual occurrence of the keyword.
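
The two steps named in the abstract, adapting the language model toward the keywords and searching a time-aligned phoneme sequence with the help of phoneme confusions, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the unigram-only language model, the fixed-length sliding window, the threshold, and the confusion probabilities (which are hard-coded here, whereas the abstract describes computing a confusion matrix).

```python
import math

# Hypothetical confusion probabilities P(recognized | intended).
# Illustrative values only; a real system would estimate them from data.
CONFUSION = {
    ("b", "p"): 0.2,
    ("p", "b"): 0.2,
    ("d", "t"): 0.2,
    ("t", "d"): 0.2,
}
MATCH_PROB = 0.7     # assumed probability that a phoneme is recognized correctly
DEFAULT_PROB = 0.01  # assumed floor for phoneme pairs not listed above


def confusion_prob(intended, recognized):
    """Look up how likely `intended` is to be recognized as `recognized`."""
    if intended == recognized:
        return MATCH_PROB
    return CONFUSION.get((intended, recognized), DEFAULT_PROB)


def find_keyword(aligned, keyword_phonemes, threshold=-1.0):
    """Slide a fixed-length window over time-aligned phonemes.

    `aligned` is a list of (phoneme, start_time, end_time) tuples.
    Returns (start_time, end_time, score) for every window whose mean
    log confusion probability is at least `threshold`.
    """
    n = len(keyword_phonemes)
    hits = []
    for i in range(len(aligned) - n + 1):
        window = aligned[i:i + n]
        score = sum(
            math.log(confusion_prob(k, p))
            for k, (p, _start, _end) in zip(keyword_phonemes, window)
        ) / n
        if score >= threshold:
            hits.append((window[0][1], window[-1][2], score))
    return hits


def boost_keywords(unigram_probs, keywords, factor=5.0):
    """Multiply keyword probabilities by `factor`, then renormalize to sum to 1."""
    boosted = {
        w: p * (factor if w in keywords else 1.0)
        for w, p in unigram_probs.items()
    }
    total = sum(boosted.values())
    return {w: p / total for w, p in boosted.items()}


# Example: the recognizer heard "p aa m" where the keyword is "b aa m".
aligned = [
    ("dh", 0.0, 0.1), ("ah", 0.1, 0.2), ("p", 0.2, 0.3),
    ("aa", 0.3, 0.45), ("m", 0.45, 0.6), ("z", 0.6, 0.7),
]
candidates = find_keyword(aligned, ["b", "aa", "m"])
adapted_lm = boost_keywords({"the": 0.5, "bomb": 0.1, "cat": 0.4}, {"bomb"})
```

Each entry in `candidates` carries the start and end times of a potential occurrence, which is the information needed to play the corresponding audio segment back to an operator for confirmation. A fixed-length window cannot handle inserted or deleted phonemes; a weighted edit distance would be one way to relax that, at the cost of a longer example.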

Published/granted documents

US20100179811A1 IDENTIFYING KEYWORD OCCURRENCES IN AUDIO DATA Publication date: 2010-07-15

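IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/28	.Constructional details of speech recognition systems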