Deep convex network with joint use of nonlinear random projection, Restricted Boltzmann Machine and batch-based parallelizable optimization

Invention Grant

US08489529B2 Deep convex network with joint use of nonlinear random projection, Restricted Boltzmann Machine and batch-based parallelizable optimization (status: in force)


Patent Title: Deep convex network with joint use of nonlinear random projection, Restricted Boltzmann Machine and batch-based parallelizable optimization
Application No.: US13077978

Application Date: 2011-03-31
Publication No.: US08489529B2

Publication Date: 2013-07-16
Inventor: Li Deng, Dong Yu, Alejandro Acero
Applicant: Li Deng, Dong Yu, Alejandro Acero
Applicant Address: US WA Redmond
Assignee: Microsoft Corporation
Current Assignee: Microsoft Corporation
Current Assignee Address: US WA Redmond
Main IPC: G06N5/00
IPC: G06N5/00


Abstract:

A method is disclosed herein that includes an act of causing a processor to access a deep-structured, layered or hierarchical model, called a deep convex network, retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto. This layered model can produce output that serves as scores, which are combined with transition probabilities between states in a hidden Markov model and with language model scores to form a full speech recognizer. The method makes joint use of nonlinear random projections and Restricted Boltzmann Machine (RBM) weights, and it stacks a lower module's output with the raw data to establish the immediately higher module. Batch-based convex optimization is performed to learn a portion of the deep convex network's weights, which makes the training suitable for parallel computation. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using an optimization criterion based on a sequence rather than on a set of unrelated frames.
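The stacking and batch-based convex training described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each module has one sigmoid hidden layer, that the lower-layer weights are a fixed Gaussian random projection (the abstract also mentions RBM weights, which are not modeled here), and that the upper-layer weights are obtained by ridge-regularized least squares over the whole batch. The names train_module, train_stack, predict and the ridge parameter are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_module(X, T, n_hidden, rng, ridge=1e-3):
    """Train one module on inputs X (n_samples, n_in) and targets T (n_samples, n_out)."""
    # Lower-layer weights: fixed random projection (assumption; RBM weights
    # could be substituted here, per the abstract).
    W = rng.normal(size=(n_hidden, X.shape[1]))
    H = sigmoid(X @ W.T)  # hidden activations, shape (n_samples, n_hidden)
    # Upper-layer weights: with H fixed, the squared error is convex in U, so
    # the whole batch is solved in closed form via the regularized normal equations.
    A = H.T @ H + ridge * np.eye(n_hidden)
    U = np.linalg.solve(A, H.T @ T)  # shape (n_hidden, n_out)
    return W, U


def module_output(X, W, U):
    return sigmoid(X @ W.T) @ U


def train_stack(X_raw, T, n_modules, n_hidden, seed=0):
    """Train modules bottom-up; each higher module sees raw data plus the lower output."""
    rng = np.random.default_rng(seed)
    modules = []
    X = X_raw
    for _ in range(n_modules):
        W, U = train_module(X, T, n_hidden, rng)
        modules.append((W, U))
        Y = module_output(X, W, U)
        # Stack the lower module's output with the raw data (assumption: only
        # the immediately lower module's output is appended).
        X = np.hstack([X_raw, Y])
    return modules


def predict(modules, X_raw):
    """Run the stack and return the top module's output scores."""
    X = X_raw
    for W, U in modules:
        Y = module_output(X, W, U)
        X = np.hstack([X_raw, Y])
    return Y
```

In this sketch the matrix products H.T @ H and H.T @ T are sums over training samples, so they could in principle be accumulated on separate data shards and added together, which is one way to read the abstract's remark about parallel computation. The sequence-level joint optimization with HMM transition probabilities and language model scores is not covered.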


Public/Granted literature

US20120254086A1 DEEP CONVEX NETWORK WITH JOINT USE OF NONLINEAR RANDOM PROJECTION, RESTRICTED BOLTZMANN MACHINE AND BATCH-BASED PARALLELIZABLE OPTIMIZATION Public/Granted day: 2012-10-04


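IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N5/00	Computer systems using knowledge-based models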