Hybrid phoneme, diphone, morpheme, and word-level deep neural networks

Invention Grant

US10235991B2 Hybrid phoneme, diphone, morpheme, and word-level deep neural networks (Active)

Patent Title: Hybrid phoneme, diphone, morpheme, and word-level deep neural networks
Application No.: US15672486

Application Date: 2017-08-09
Publication No.: US10235991B2

Publication Date: 2019-03-19
Inventor: Jintao Jiang, Hassan Sawaf, Mudar Yaghi
Applicant: Apptek, Inc.
Applicant Address: US VA McLean
Assignee: AppTek, Inc.
Current Assignee: AppTek, Inc.
Current Assignee Address: US VA McLean
Agency: Morgan, Lewis & Bockius LLP
Agent: Robert C. Bertin; Rachael Lea Leventhal
Main IPC: G10L15/00
IPC: G10L15/00; G10L15/06; G10L15/16; G10L15/32; G10L15/02; G10L15/187; G10L25/30

Abstract:

A hybrid approach that uses frame-, phone-, diphone-, morpheme-, and word-level Deep Neural Networks (DNNs) in model training and applications is based on training a regular automatic speech recognition (ASR) system, which can itself be based on Gaussian Mixture Models (GMMs) or DNNs. All the training data (in the form of features) are aligned with the transcripts at the phoneme and word levels, with timing information, and new features are formed for phonemes, diphones, morphemes, and units up to words. The regular ASR system produces a result lattice with timing information for each word. A feature is then extracted and sent to the word-level DNN for scoring. Phoneme features are sent to the corresponding DNNs for training. Scores are combined to form word-level scores, a rescored lattice, and a new recognition result.
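
The rescoring step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the scores are combined, so the weighted sum of log-scores used here is an assumption, as are the `Arc` structure, the `combine_scores` and `rescore_lattice` helpers, the level names, the weights, and all numeric scores. The DNN scores are supplied as plain numbers; no model is trained or run.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Arc:
    """One word hypothesis in the lattice (hypothetical structure)."""
    src: int                       # start node id
    dst: int                       # end node id
    word: str
    start: float                   # start time in seconds
    end: float                     # end time in seconds
    asr_score: float               # log-score from the first-pass ASR system
    unit_scores: Dict[str, float]  # assumed log-scores from each DNN level


def combine_scores(arc: Arc, weights: Dict[str, float]) -> float:
    """Assumed combination rule: first-pass score plus weighted DNN scores."""
    total = arc.asr_score
    for level, score in arc.unit_scores.items():
        total += weights.get(level, 0.0) * score
    return total


def rescore_lattice(arcs: List[Arc], weights: Dict[str, float],
                    start_node: int, end_node: int) -> Tuple[float, List[str]]:
    """Return the best (score, word sequence) through the rescored lattice.

    Assumes node ids increase along every arc, so processing arcs in
    order of their source node visits the graph in topological order.
    """
    best: Dict[int, Tuple[float, List[str]]] = {start_node: (0.0, [])}
    for arc in sorted(arcs, key=lambda a: a.src):
        if arc.src not in best:
            continue  # source node is unreachable from the start node
        score = best[arc.src][0] + combine_scores(arc, weights)
        if arc.dst not in best or score > best[arc.dst][0]:
            best[arc.dst] = (score, best[arc.src][1] + [arc.word])
    return best[end_node]


# Toy lattice with two competing words in each of two time slots.
lattice = [
    Arc(0, 1, "recognize", 0.0, 0.6, -2.0, {"phoneme": -1.0, "word": -0.5}),
    Arc(0, 1, "wreck", 0.0, 0.6, -1.5, {"phoneme": -3.0, "word": -2.5}),
    Arc(1, 2, "speech", 0.6, 1.1, -1.0, {"phoneme": -0.5, "word": -0.5}),
    Arc(1, 2, "beach", 0.6, 1.1, -0.8, {"phoneme": -2.0, "word": -2.0}),
]

# With all weights at zero only the first-pass ASR scores are used.
first_pass_score, first_pass_words = rescore_lattice(lattice, {}, 0, 2)

# Illustrative weights for the phoneme-level and word-level DNN scores.
best_score, best_words = rescore_lattice(
    lattice, {"phoneme": 0.5, "word": 0.5}, 0, 2
)
print(first_pass_words, best_words)
```

In this toy lattice the first-pass scores alone favor "wreck beach", while adding the assumed DNN scores shifts the best path to "recognize speech". A real system would tune the weights on held-out data and would also handle diphone and morpheme levels, which follow the same pattern as the two levels shown.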

Public/Granted literature

US20180047385A1 HYBRID PHONEME, DIPHONE, MORPHEME, AND WORD-LEVEL DEEP NEURAL NETWORKS Public/Granted day: 2018-02-15

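IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)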