Precision speech to text conversion

Invention Grant

US08041565B1 Precision speech to text conversion (in force)

Patent Title: Precision speech to text conversion
Application No.: US11763943

Application Date: 2007-06-15
Publication No.: US08041565B1

Publication Date: 2011-10-18
Inventor: Vinod K. Bhardwaj, Scott England, Dean Whitlock
Applicant: Vinod K. Bhardwaj, Scott England, Dean Whitlock
Applicant Address: US CA Santa Clara
Assignee: FoneWeb, Inc.
Current Assignee: FoneWeb, Inc.
Current Assignee Address: US CA Santa Clara
Agency: Beyer Law Group LLP
Main IPC: G10L15/26
IPC: G10L15/26

Abstract:

A speech-to-text conversion module uses a central database of user speech profiles to convert speech to text. Incoming audio information is fragmented into numerous audio fragments based upon detecting silence. The audio information is also converted to numerous text files by any number of speech engines. Each text file is then fragmented into numerous text fragments based upon the boundaries established during the audio fragmentation. Each set of text fragments from the different speech engines corresponding to a single audio fragment is then compared. The best approximation of the audio fragment is produced from the set of text fragments; a hybrid may be produced. If no agreement is reached, the audio fragment and the set of text fragments are sent to human agents who verify and edit to produce a final edited text fragment that best corresponds to the audio fragment. Fragmentation that produces overlapping audio fragments requires splicing of the final text fragments to produce the output text file.
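
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the amplitude threshold, the per-word timestamps, and the simple majority-vote rule are all assumptions made for illustration. The sketch covers silence-based fragmentation, slicing each engine's text along the audio boundaries, and escalation to a human agent when the engines disagree. It does not attempt the hybrid result or the splicing of overlapping fragments.

```python
from collections import Counter
from typing import List, Optional, Tuple


def split_on_silence(samples: List[float], threshold: float = 0.02,
                     min_silence: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of non-silent runs.

    Assumption: "silence" is min_silence consecutive samples whose
    absolute amplitude is below threshold.
    """
    fragments = []
    start = None  # index where the current fragment began
    quiet = 0     # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                # Close the fragment at the first quiet sample.
                fragments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        fragments.append((start, len(samples) - quiet))
    return fragments


def slice_words(timed_words: List[Tuple[str, int]],
                start: int, end: int) -> str:
    """Text fragment for one audio fragment.

    Assumption: each engine reports (word, sample_index) pairs.
    """
    return " ".join(w for w, t in timed_words if start <= t < end)


def pick_text(candidates: List[str], min_agree: int = 2) -> Optional[str]:
    """Return the text at least min_agree engines agree on, else None.

    None stands for "no agreement; send to a human agent".
    """
    normalized = [" ".join(c.lower().split()) for c in candidates]
    if not normalized:
        return None
    text, votes = Counter(normalized).most_common(1)[0]
    return text if votes >= min_agree else None


# Hypothetical output from three speech engines for the same audio.
audio = [0, 0.5, 0.6, 0, 0, 0, 0.4, 0.3, 0]
engines = [
    [("hello", 1), ("world", 6)],
    [("Hello", 2), ("word", 6)],
    [("hello", 1), ("whirled", 7)],
]
for start, end in split_on_silence(audio):
    texts = [slice_words(e, start, end) for e in engines]
    result = pick_text(texts)
    print((start, end), result if result is not None else "needs human review")
```

With this sample data the first fragment is resolved automatically because all three engines agree after normalization, while the second fragment produces three different candidates and is flagged for review.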

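IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/26	. Speech to text systems (G10L15/08 takes precedence)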