Automatically identifying speakers in real-time through media processing with dialog understanding supported by AI techniques

Invention Grant

US10762906B2 Automatically identifying speakers in real-time through media processing with dialog understanding supported by AI techniques (Active)

Patent Title: Automatically identifying speakers in real-time through media processing with dialog understanding supported by AI techniques
Application No.: US15967829

Application Date: 2018-05-01
Publication No.: US10762906B2

Publication Date: 2020-09-01
Inventor: Marcio Ferreira Moreno, Helon Vicente Hultmann Ayala, Daniel Salles Chevitarese, Rafael R. de Mello Brandao, Renato Fontoura de Gusmao Cerqueira
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Scully, Scott, Murphy & Presser, P.C.
Agent: Joseph Petrokaitis
Main IPC: G10L17/22
IPC: G10L17/22; G10L17/26; G10L15/26

Abstract:

Speakers are automatically identified in real time through media processing with dialog understanding. A plurality of audio streams may be received, with an audio stream representing the speech of a participant speaking during an online meeting. A voice characteristic of the voice corresponding to the participant's speech in the audio stream may be determined. The plurality of audio streams may be converted into text, and natural language processing may be performed to determine the content context of the dialog. The natural language processing infers a name to associate with the voice in the audio stream based on the determined content context. A data structure linking the name with the voice may be created and stored in a knowledge base. A user interface associated with the online meeting application is triggered to present the name or identity of the speaker.
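
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the regular expression is a deliberately simple stand-in for the natural language processing step, the voice characteristic is reduced to a hypothetical string identifier (`voice_id`), and the names `SpeakerKnowledgeBase`, `infer_name`, and `process_utterance` are assumptions introduced only for illustration. Audio capture and speech-to-text are assumed to have already produced the transcript.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumption: a toy stand-in for the NLP step. It only recognises simple
# self-introductions such as "my name is Alice" or "this is Bob".
INTRO_PATTERN = re.compile(r"\b(?i:my name is|this is)\s+([A-Z][a-z]+)")


@dataclass
class SpeakerKnowledgeBase:
    """Hypothetical knowledge base linking a voice identifier to a name."""

    names_by_voice: Dict[str, str] = field(default_factory=dict)

    def link(self, voice_id: str, name: str) -> None:
        # Store the data structure that associates the voice with the name.
        self.names_by_voice[voice_id] = name

    def lookup(self, voice_id: str) -> Optional[str]:
        return self.names_by_voice.get(voice_id)


def infer_name(transcript: str) -> Optional[str]:
    """Return a name inferred from the transcript, or None if none is found."""
    match = INTRO_PATTERN.search(transcript)
    return match.group(1) if match else None


def process_utterance(kb: SpeakerKnowledgeBase, voice_id: str, transcript: str) -> str:
    """Update the knowledge base and return the label a UI could display."""
    name = infer_name(transcript)
    if name is not None:
        kb.link(voice_id, name)
    return kb.lookup(voice_id) or "Unknown speaker"


if __name__ == "__main__":
    kb = SpeakerKnowledgeBase()
    print(process_utterance(kb, "voice-1", "Hello, my name is Alice."))
    print(process_utterance(kb, "voice-1", "Let's review the agenda."))
    print(process_utterance(kb, "voice-2", "Sounds good."))
```

Once a voice has been linked to a name, later utterances from the same voice resolve to that name even when the speaker does not introduce themselves again. A real system would replace the string identifier with a voice embedding compared by similarity, and the pattern match with genuine dialog understanding.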

Public/Granted literature

US20190341059A1 AUTOMATICALLY IDENTIFYING SPEAKERS IN REAL-TIME THROUGH MEDIA PROCESSING WITH DIALOG UNDERSTANDING SUPPORTED BY AI TECHNIQUES Public/Granted day: 2019-11-07

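IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification
G10L17/22	. Interactive procedures; man-machine interfaces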