Multithreaded speech data preprocessing

Invention Grant

US11862171B2 Multithreaded speech data preprocessing (in force)

Patent Title: Multithreaded speech data preprocessing
Application No.: US17993385

Application Date: 2022-11-23
Publication No.: US11862171B2

Publication Date: 2024-01-02
Inventor: Xiaolong Li, Xiaozhuo Cheng, Samuel Norris Henderson, Xu Yang
Applicant: SAS Institute Inc.
Applicant Address: US NC Cary
Assignee: SAS Institute Inc.
Current Assignee: SAS Institute Inc.
Current Assignee Address: US NC Cary
Agency: KDW FIRM PLLC
Main IPC: G10L15/22
IPC: G10L15/22; G10L15/26; G10L15/04; G10L25/78; G10L25/30; G10L15/02

Abstract:

An apparatus includes a processor to: receive, from a requesting device, a request to perform speech-to-text conversion of a speech data set; within a first thread of a thread pool, perform a first pause detection technique to identify a first set of likely sentence pauses; within a second thread of the thread pool, perform a second pause detection technique to identify a second set of likely sentence pauses; perform a speaker diarization technique to identify a set of likely speaker changes; divide the speech data set into data segments representing speech segments based on a combination of at least the first set of likely sentence pauses, the second set of likely sentence pauses, and the set of likely speaker changes; use at least an acoustic model with each data segment to identify likely speech sounds; and generate a transcript based, at least in part, on the identified likely speech sounds.
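
The segmentation flow in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the two pause detection techniques, the diarization method, or the rule for combining their outputs, so everything concrete here is an assumption made for illustration: the mean-energy and peak-amplitude detectors, the precomputed speaker turns standing in for diarization, the union-with-tolerance merge, and all function names and thresholds are hypothetical and are not taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def mean_energy(frame):
    # Hypothetical measure for the first technique: average squared amplitude.
    return sum(s * s for s in frame) / len(frame)


def peak_amplitude(frame):
    # Hypothetical measure for the second technique: largest absolute sample.
    return max(abs(s) for s in frame)


def find_pauses(samples, rate, measure, threshold, min_pause=0.3, frame=0.02):
    """Return midpoints (in seconds) of quiet runs lasting at least min_pause."""
    n = max(1, round(rate * frame))  # samples per analysis frame
    pauses, run_start = [], None
    for i in range(0, len(samples) - n + 1, n):
        quiet = measure(samples[i:i + n]) < threshold
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if (i - run_start) / rate >= min_pause:
                pauses.append((run_start + i) / 2 / rate)
            run_start = None
    return pauses  # trailing silence is ignored; it does not separate speech


def speaker_changes(turns):
    # Stand-in for diarization: turns is a list of (start_seconds, speaker_id).
    return [t for (t, spk), (_, prev) in zip(turns[1:], turns) if spk != prev]


def merge_breakpoints(*candidate_sets, tolerance=0.2):
    # Assumed combination rule: union of all candidates, dropping near-duplicates.
    merged = []
    for t in sorted(t for s in candidate_sets for t in s):
        if not merged or t - merged[-1] > tolerance:
            merged.append(t)
    return merged


def preprocess(samples, rate, turns, frame=0.02):
    """Divide the audio into (start, end) segments, in seconds."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Each pause detection technique runs in its own pool thread.
        first = pool.submit(find_pauses, samples, rate, mean_energy, 0.01, 0.3, frame)
        second = pool.submit(find_pauses, samples, rate, peak_amplitude, 0.05, 0.3, frame)
        changes = speaker_changes(turns)
        breakpoints = merge_breakpoints(first.result(), second.result(), changes)
    duration = len(samples) / rate
    edges = [0.0] + [t for t in breakpoints if 0.0 < t < duration] + [duration]
    return list(zip(edges[:-1], edges[1:]))
```

Each returned segment would then be passed to an acoustic model to identify likely speech sounds; that stage is omitted because the abstract gives no detail about it.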

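IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/22	. Procedures used during a speech recognition process, e.g. man-machine dialogue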