Patent search ap:("MICROSOFT CORPORATION") AND inv:"GONG Page Yifan"

1.

Invention application
LEARNING STUDENT DNN VIA OUTPUT DISTRIBUTION (under examination, published)
Publication number: WO2016037350A1

Publication date: 2016-03-17

Application number: PCT/CN2014/086397

Application date: 2014-09-12

Applicant: MICROSOFT CORPORATION; ZHAO, Rui; HUANG, Jui-Ting; LI, Jinyu; GONG, Yifan

Inventor: ZHAO, Rui; HUANG, Jui-Ting; LI, Jinyu; GONG, Yifan

IPC: G06K9/66

CPC classification number: G06N3/084, G06N3/0454, G06N7/005, G06N99/005, G09B5/00

Abstract: Systems and methods are provided for generating a DNN classifier by "learning" a "student" DNN model from a larger, more accurate "teacher" DNN model. The student DNN may be trained from unlabeled training data by passing the unlabeled training data through the teacher DNN, which may be trained from labeled data. In one embodiment, an iterative process is applied to train the student DNN by minimizing the divergence of the output distributions from the teacher and student DNN models. For each iteration until convergence, the difference in the outputs of these two DNNs is used to update the student DNN model, and outputs are determined again, using the unlabeled training data. The resulting trained student DNN model may be suitable for providing accurate signal processing applications on devices having limited computational or storage resources such as mobile or wearable devices. In an embodiment, the teacher DNN model comprises an ensemble of DNN models.
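
Illustrative sketch (not taken from the patent text): the abstract describes updating the student from the difference between teacher and student output distributions on unlabeled data. The Python below is a minimal toy version of that idea under stated assumptions: both models are single linear-plus-softmax layers rather than deep networks, the divergence is taken to be KL(teacher || student), and a fixed number of epochs stands in for a convergence check. All function names and the learning rate are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def mean_kl(teacher_weights, student_weights, data):
    """Average KL(teacher || student) over a set of unlabeled inputs."""
    total = 0.0
    for x in data:
        p = softmax(linear(teacher_weights, x))
        q = softmax(linear(student_weights, x))
        total += kl_divergence(p, q)
    return total / len(data)


def train_student(teacher_weights, unlabeled, n_classes, lr=0.5, epochs=200):
    """Fit a linear-softmax student to the teacher's output distributions.

    Assumption: the teacher is fixed, so its posteriors are computed once.
    """
    dim = len(unlabeled[0])
    student = [[0.0] * dim for _ in range(n_classes)]
    targets = [softmax(linear(teacher_weights, x)) for x in unlabeled]
    for _ in range(epochs):
        for x, p_teacher in zip(unlabeled, targets):
            p_student = softmax(linear(student, x))
            for k in range(n_classes):
                # Gradient of KL(teacher || student) w.r.t. student logit k
                # is the difference between the two output distributions.
                grad = p_student[k] - p_teacher[k]
                for j in range(dim):
                    student[k][j] -= lr * grad * x[j]
    return student
```

No labels appear anywhere in the loop: the only training signal is the teacher's posterior for each unlabeled input, which is the property the abstract emphasizes.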

2.

Invention application
MULTILINGUAL DEEP NEURAL NETWORK (under examination, published)
Publication number: WO2014164080A1

Publication date: 2014-10-09

Application number: PCT/US2014/020448

Application date: 2014-03-05

Applicant: MICROSOFT CORPORATION

Inventor: HUANG, Jui-Ting; LI, Jinyu; YU, Dong; DENG, Li; GONG, Yifan

IPC: G10L15/16

CPC classification number: G10L15/063, G06N3/0454, G06N3/084, G10L15/16

Abstract: Described herein are various technologies pertaining to a multilingual deep neural network (MDNN). The MDNN includes a plurality of hidden layers, wherein values for weight parameters of the plurality of hidden layers are learned during a training phase based upon training data in terms of acoustic raw features for multiple languages. The MDNN further includes softmax layers that are trained for each target language separately, making use of the hidden layer values trained jointly with multiple source languages. The MDNN is adaptable, such that a new softmax layer may be added on top of the existing hidden layers, where the new softmax layer corresponds to a new target language.
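
Illustrative sketch (not taken from the patent text): the structure in the abstract, hidden layers shared across languages with a separate softmax layer per target language, can be pictured roughly as follows. This is a forward-pass-only toy in Python; the class name, the tanh activation, the omission of biases, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def random_matrix(rng, rows, cols):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultilingualDNN:
    """Shared hidden layers plus one softmax layer per language (hypothetical)."""

    def __init__(self, input_dim, hidden_dims, seed=0):
        self.rng = random.Random(seed)
        self.shared = []  # hidden-layer weights used by every language
        prev = input_dim
        for width in hidden_dims:
            self.shared.append(random_matrix(self.rng, width, prev))
            prev = width
        self.top_dim = prev
        self.softmax_layers = {}  # language -> output-layer weights

    def add_language(self, language, n_outputs):
        """Attach a new softmax layer; the shared layers are left untouched."""
        self.softmax_layers[language] = random_matrix(
            self.rng, n_outputs, self.top_dim
        )

    def hidden_activations(self, features):
        """Run the acoustic features through the shared hidden layers."""
        h = features
        for weights in self.shared:
            h = [math.tanh(z) for z in matvec(weights, h)]
        return h

    def posteriors(self, features, language):
        """Output distribution for one language's softmax layer."""
        h = self.hidden_activations(features)
        return softmax(matvec(self.softmax_layers[language], h))
```

Calling `add_language` for a new target language only allocates a new output layer on top of the existing hidden stack, which mirrors the adaptability claim in the abstract; training of either part is outside this sketch.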

3.

Invention application
POSTERIOR-BASED FEATURE WITH PARTIAL DISTANCE ELIMINATION FOR SPEECH RECOGNITION (under examination, published)
Publication number: WO2014137760A2

Publication date: 2014-09-12

Application number: PCT/US2014/019147

Application date: 2014-02-27

Applicant: MICROSOFT CORPORATION

Inventor: LI, Jinyu; YAN, Zhijie; HUO, Qiang; GONG, Yifan

IPC: G10L15/22

CPC classification number: G10L15/14, G10L15/10

Abstract: A high-dimensional posterior-based feature with partial distance elimination may be utilized for speech recognition. The log likelihood values of a large number of Gaussians are needed to generate the high-dimensional posterior feature. Gaussians with very small log likelihoods are associated with zero posterior values. Log likelihoods for Gaussians for a speech frame may be evaluated with a partial distance elimination method. If the partial distance of a Gaussian is already too small, the Gaussian will have a zero posterior value. The partial distance may be calculated by sequentially adding individual dimensions in a group of dimensions. The partial distance elimination occurs when less than all of the dimensions in the group are sequentially added.
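
Illustrative sketch (not taken from the patent text): one way to read the partial distance elimination described in the abstract is to accumulate a Gaussian's log likelihood one dimension at a time and stop as soon as the running value falls below a floor, assigning that Gaussian a zero posterior. The Python below assumes diagonal-covariance Gaussians with equal mixture weights, and it assumes the floor is the best full log likelihood seen so far minus a `beam` value; the function names and the beam are hypothetical.

```python
import math


def partial_log_likelihood(x, mean, var, log_const, floor):
    """Accumulate a diagonal Gaussian's log likelihood dimension by dimension.

    Each dimension can only lower the running value, so once it drops below
    `floor` the full value must be below it too. Returns None in that case.
    """
    acc = log_const
    for xd, md, vd in zip(x, mean, var):
        acc -= 0.5 * (xd - md) ** 2 / vd
        if acc < floor:
            return None  # eliminated before all dimensions were added
    return acc


def posterior_feature(x, gaussians, beam=10.0):
    """Posterior vector over Gaussians, with zeros for eliminated ones.

    `gaussians` is a list of (mean, variance) pairs of equal dimension.
    """
    best = -math.inf
    scores = []
    for mean, var in gaussians:
        log_const = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        ll = partial_log_likelihood(x, mean, var, log_const, best - beam)
        scores.append(ll)
        if ll is not None and ll > best:
            best = ll
    # Gaussians scored before `best` improved may still lie outside the beam.
    kept = [s if s is not None and s >= best - beam else None for s in scores]
    total = sum(math.exp(s - best) for s in kept if s is not None)
    return [0.0 if s is None else math.exp(s - best) / total for s in kept]
```

For a frame near the first Gaussian's mean, a Gaussian centered far away is dropped after its first dimension is added and contributes a zero entry, while the surviving entries are normalized to sum to one.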

4.

Invention application
RECOGNITION ARCHITECTURE FOR GENERATING ASIAN CHARACTERS (under examination, published)
Publication number: WO2008134208A1

Publication date: 2008-11-06

Application number: PCT/US2008/059688

Application date: 2008-04-08

Applicant: MICROSOFT CORPORATION

Inventor: KUO, Shiun-Zu; FEIGE, Kevin, E.; GONG, Yifan; MIWA, Taro; CHITRAPU, Arun

IPC: G06F17/28

CPC classification number: G06F17/273, G06F17/2223

Abstract: Architecture for correcting incorrect recognition results in an Asian language speech recognition system. A spelling mode can be launched in response to receiving speech input, the spelling mode for correcting incorrect spelling of the recognition results or generating new words. Correction can be obtained using speech and/or manual selection and entry. The architecture facilitates correction in a single pass, rather than multiple times as in conventional systems. Words corrected using the spelling mode are corrected as a unit and treated as a word. The spelling mode applies to languages of at least the Asian continent, such as Simplified Chinese, Traditional Chinese, and/or other Asian languages such as Japanese.

5.

Invention publication
RECOGNITION ARCHITECTURE FOR GENERATING ASIAN CHARACTERS (under examination, published)
Publication number: EP2153352A1

Publication date: 2010-02-17

Application number: EP08745321.3

Application date: 2008-04-08

Applicant: Microsoft Corporation

Inventor: KUO, Shiun-Zu; FEIGE, Kevin, E.; GONG, Yifan; MIWA, Taro; CHITRAPU, Arun

IPC: G06F17/28

Abstract: Architecture for correcting incorrect recognition results in an Asian language speech recognition system. A spelling mode can be launched in response to receiving speech input, the spelling mode for correcting incorrect spelling of the recognition results or generating new words. Correction can be obtained using speech and/or manual selection and entry. The architecture facilitates correction in a single pass, rather than multiple times as in conventional systems. Words corrected using the spelling mode are corrected as a unit and treated as a word. The spelling mode applies to languages of at least the Asian continent, such as Simplified Chinese, Traditional Chinese, and/or other Asian languages such as Japanese.
