Speech recognition system using machine learning to classify phone posterior context information and estimate boundaries in speech from combined boundary posteriors

Invention Grant

US10424289B2 Speech recognition system using machine learning to classify phone posterior context information and estimate boundaries in speech from combined boundary posteriors (Active)

Patent Title: Speech recognition system using machine learning to classify phone posterior context information and estimate boundaries in speech from combined boundary posteriors
Application No.: US16103251

Application Date: 2018-08-14
Publication No.: US10424289B2

Publication Date: 2019-09-24
Inventor: Ozlem Kalinli-Akbacak
Applicant: Sony Interactive Entertainment Inc.
Applicant Address: JP Tokyo
Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
Current Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
Current Assignee Address: JP Tokyo
Agency: JDI Patent
Agent: Joshua Isenberg; Robert Pullman
Main IPC: G10L15/04
IPC: G10L15/04 ; G10L25/30 ; G10L25/03 ; G10L15/16

Abstract:

A speech recognition system includes a phone classifier and a boundary classifier. The phone classifier generates combined boundary posteriors from a combination of auditory attention features and phone posteriors by feeding phone posteriors of neighboring frames of an audio signal into a machine learning algorithm to classify phone posterior context information. The boundary classifier estimates boundaries in speech contained in the audio signal from the combined boundary posteriors.
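
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the helper names, the toy scoring function standing in for a trained machine learning classifier, the weighted-average combination, and the peak-picking rule are all assumptions made for illustration only.

```python
from typing import List, Sequence

Frame = Sequence[float]  # one phone-posterior vector per audio frame


def stack_context(posteriors: List[Frame], context: int) -> List[List[float]]:
    """Concatenate each frame's posteriors with those of its neighbors.

    Edge frames are padded by repeating the first or last frame.
    """
    n = len(posteriors)
    stacked = []
    for t in range(n):
        window: List[float] = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)
            window.extend(posteriors[idx])
        stacked.append(window)
    return stacked


def toy_boundary_score(window: List[float], num_phones: int) -> float:
    """Hypothetical stand-in for a trained classifier (assumption).

    Measures how much the phone distribution changes between the first
    and last frame of the window: 0.0 = identical, 1.0 = disjoint.
    """
    left = window[:num_phones]
    right = window[-num_phones:]
    return 0.5 * sum(abs(a - b) for a, b in zip(left, right))


def combine(phone_scores: List[float], attention_scores: List[float],
            weight: float = 0.5) -> List[float]:
    """Weighted average of the two score streams (assumed combination rule)."""
    return [weight * p + (1.0 - weight) * a
            for p, a in zip(phone_scores, attention_scores)]


def pick_boundaries(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Return frame indices that are local maxima at or above the threshold."""
    boundaries = []
    for t, s in enumerate(scores):
        prev_s = scores[t - 1] if t > 0 else float("-inf")
        next_s = scores[t + 1] if t + 1 < len(scores) else float("-inf")
        if s >= threshold and s > prev_s and s >= next_s:
            boundaries.append(t)
    return boundaries


# Made-up data: two phones, with the phone identity changing after frame 2.
phone_posteriors = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
# Made-up boundary scores from a separate auditory attention stage.
attention_scores = [0.0, 0.1, 0.4, 0.9, 0.1, 0.0]

windows = stack_context(phone_posteriors, context=1)
phone_scores = [toy_boundary_score(w, num_phones=2) for w in windows]
combined = combine(phone_scores, attention_scores)
boundaries = pick_boundaries(combined)
```

In a real system the toy scoring function would be replaced by a classifier trained on labeled boundaries, and the combination could itself be learned rather than fixed.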

Public/Granted literature

US20190005943A1 SPEECH RECOGNITION SYSTEM USING MACHINE LEARNING TO CLASSIFY PHONE POSTERIOR CONTEXT INFORMATION AND ESTIMATE BOUNDARIES IN SPEECH FROM COMBINED BOUNDARY POSTERIORS
Public/Granted day: 2019-01-03

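IPC Classification:

G	Physics
G10	Musical instruments; Acoustics
G10L	Speech analysis or synthesis; Speech recognition; Speech or voice processing; Speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/04	. Segmentation; Word boundary detection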