Invention Grant
- Patent Title: Diarization driven by meta-information identified in discussion content
- Application No.: US15819158
- Application Date: 2017-11-21
- Publication No.: US10468031B2
- Publication Date: 2019-11-05
- Inventor: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: VanLeeuwen & VanLeeuwen
- Agent: Feb R. Cabrasawan
- Main IPC: G10L17/00
- IPC: G10L17/00; G10L15/22; G10L15/30; G10L15/183; G10L25/51; G10L25/78

Abstract:
An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. The STD process analyzes a number of speaker segments using a language model that determines when speaker changes occur. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.
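The last two stages described in the abstract (speaker turn detection over recognized words, then clustering of the resulting segments) can be illustrated with a minimal Python sketch. It is not the patented method. It assumes VAD and ASR have already produced timestamped words, substitutes a hypothetical cue-word scorer (`change_score`, `TURN_OPENERS`) for the language model, and substitutes a single made-up per-word `pitch` value for real speaker features. All names, thresholds, and data are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str      # token as emitted by an upstream ASR step (assumed)
    start: float   # start time in seconds
    end: float     # end time in seconds
    pitch: float   # hypothetical stand-in for a speaker embedding


# Hypothetical cue words that often open a new speaker turn.
TURN_OPENERS = {"yes", "no", "okay", "thanks", "hello"}


def change_score(prev: Word, nxt: Word) -> float:
    """Toy stand-in for a language model scoring a speaker change."""
    score = 0.0
    if prev.text.endswith("?"):
        score += 0.6  # a question is often followed by another speaker
    if nxt.text.lower().strip("?.,!") in TURN_OPENERS:
        score += 0.5  # the next word looks like the start of a reply
    return score


def detect_turns(words, threshold=0.5):
    """Split words into segments; every cut falls on a word boundary."""
    if not words:
        return []
    segments, current = [], [words[0]]
    for prev, nxt in zip(words, words[1:]):
        if change_score(prev, nxt) >= threshold:
            segments.append(current)
            current = []
        current.append(nxt)
    segments.append(current)
    return segments


def cluster_segments(segments, max_gap=30.0):
    """Greedy clustering: join the first cluster whose reference pitch
    (the mean pitch of the segment that created it) is within max_gap."""
    references, labels = [], []
    for seg in segments:
        mean = sum(w.pitch for w in seg) / len(seg)
        for i, ref in enumerate(references):
            if abs(ref - mean) <= max_gap:
                labels.append(i)
                break
        else:
            references.append(mean)
            labels.append(len(references) - 1)
    return labels


# Made-up example data: two speakers, three turns.
words = [
    Word("are", 0.0, 0.2, 210.0), Word("you", 0.2, 0.4, 205.0),
    Word("there?", 0.4, 0.8, 215.0),
    Word("yes", 0.9, 1.1, 120.0), Word("I", 1.1, 1.2, 118.0),
    Word("am", 1.2, 1.4, 125.0),
    Word("okay", 1.6, 1.9, 208.0), Word("good", 1.9, 2.2, 212.0),
]
segments = detect_turns(words)
labels = cluster_segments(segments)
for seg, label in zip(segments, labels):
    print(f"speaker {label}: {' '.join(w.text for w in seg)}")
```

Because segmentation runs on recognized words rather than raw audio frames, a segment can never split a word, which matches the abstract's requirement that each speaker segment end at a word boundary. A real system would replace the cue-word scorer with a trained language model and the pitch average with proper speaker embeddings.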
Public/Granted literature
- US20190156835A1 Diarization Driven by Meta-Information Identified in Discussion Content Public/Granted day: 2019-05-23