Invention Grant
- Patent Title: Speech segmentation based on combination of pause detection and speaker diarization
- Application No.: US17851264
- Application Date: 2022-06-28
- Publication No.: US11538481B2
- Publication Date: 2022-12-27
- Inventor: Xiaolong Li, Samuel Norris Henderson, Xiaozhuo Cheng, Xu Yang
- Applicant: SAS Institute Inc.
- Applicant Address: US NC Cary
- Assignee: SAS Institute Inc.
- Current Assignee: SAS Institute Inc.
- Current Assignee Address: US NC Cary
- Agency: KDB Firm PLLC
- Main IPC: G10L15/04
- IPC: G10L15/04 ; G10L15/16 ; G10L15/26 ; G10L25/78 ; G10L25/30 ; G10L15/02

Abstract:
An apparatus includes at least one processor to, in response to a request to perform speech-to-text conversion: perform a pause detection technique including analyzing speech audio to identify pauses, and analyzing lengths of the pauses to identify likely sentence pauses; perform a speaker diarization technique including dividing the speech audio into fragments, analyzing vocal characteristics of speech sounds of each fragment to identify a speaker of a set of speakers, and identifying instances of a change in speakers between each temporally consecutive pair of fragments to identify likely speaker changes; and perform speech-to-text operations including dividing the speech audio into segments based on at least the likely sentence pauses and likely speaker changes, using at least an acoustic model with each segment to identify likely speech sounds in the speech audio, and generating a transcript of the speech audio based at least on the likely speech sounds.
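The segmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the 0.5-second sentence-pause threshold, the fixed fragment length, and the rule that merges nearby cut points are all assumptions made for illustration. Pauses are assumed to arrive as (start, end) times in seconds, and diarization output as one speaker label per fixed-length fragment.

```python
from typing import List, Tuple


def likely_sentence_pauses(pauses: List[Tuple[float, float]],
                           min_len: float = 0.5) -> List[float]:
    """Return the midpoint of every pause at least min_len seconds long.

    min_len is a hypothetical threshold; the abstract only says that
    pause lengths are analyzed to find likely sentence pauses.
    """
    return [(start + end) / 2 for start, end in pauses if end - start >= min_len]


def likely_speaker_changes(fragment_speakers: List[str],
                           fragment_len: float) -> List[float]:
    """Return the times at which consecutive fragments have different speakers."""
    return [
        i * fragment_len
        for i in range(1, len(fragment_speakers))
        if fragment_speakers[i] != fragment_speakers[i - 1]
    ]


def segment(duration: float,
            pauses: List[Tuple[float, float]],
            fragment_speakers: List[str],
            fragment_len: float,
            min_len: float = 0.5,
            merge_tol: float = 0.3) -> List[Tuple[float, float]]:
    """Divide [0, duration] into segments at sentence pauses and speaker changes.

    Cut points closer than merge_tol seconds to the previously kept cut
    are dropped (an assumed rule, so one boundary is not counted twice).
    """
    cuts = sorted(
        likely_sentence_pauses(pauses, min_len)
        + likely_speaker_changes(fragment_speakers, fragment_len)
    )
    kept: List[float] = []
    for cut in cuts:
        inside = 0.0 < cut < duration
        if inside and (not kept or cut - kept[-1] > merge_tol):
            kept.append(cut)
    bounds = [0.0] + kept + [duration]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    # Made-up input: three pauses and ten one-second fragments.
    demo_pauses = [(1.0, 1.125), (3.0, 3.75), (7.0, 7.5)]
    demo_speakers = ["A", "A", "A", "A", "B", "B", "B", "A", "A", "A"]
    for seg in segment(10.0, demo_pauses, demo_speakers, fragment_len=1.0):
        print(seg)
```

In a full pipeline, each resulting segment would then be passed to an acoustic model to identify likely speech sounds, as the abstract describes; that stage is outside the scope of this sketch.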
Public/Granted literature
- US20220335947A1 SPEECH SEGMENTATION BASED ON COMBINATION OF PAUSE DETECTION AND SPEAKER DIARIZATION Public/Granted day: 2022-10-20