Invention Grant
- Patent Title: Parallel processing of data sets
- Patent Title (中): 并行处理数据集
- Application No.: US12942736
- Application Date: 2010-11-09
- Publication No.: US08868470B2
- Publication Date: 2014-10-21
- Inventor: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
- Applicant: Ning-Yi Xu, Feng-Hsiung Hsu, Feng Yan
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agency: Lee & Hayes PLLC
- Agent: Carole Boelitz; Micky Minhas
- Main IPC: G06F1/00
- IPC: G06F1/00; G06N5/00; G06F9/50

Abstract:
Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as graphics processing units. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm, and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.
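The partition, local-count, and aggregate flow in the abstract can be illustrated with a small Python sketch. It is not the patented method: it assumes a simple round-robin partition and an approximate scheme in which each worker runs one collapsed Gibbs sweep against a private snapshot of the global word-topic counts and returns only its local change in counts. CPU threads stand in for the processors, and every name here (`partition`, `gibbs_sweep`, `train`, the `alpha` and `beta` priors) is hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def partition(docs, num_parts):
    """Assumed round-robin split; the patent's partition algorithm may differ."""
    return [docs[i::num_parts] for i in range(num_parts)]


def gibbs_sweep(docs, assigns, global_wt, num_topics, alpha, beta, seed):
    """One CGS sweep over a partition; returns the local change in counts."""
    rng = random.Random(seed)
    vocab_size = len(global_wt)
    wt = [row[:] for row in global_wt]  # private snapshot of global counts
    topic_tot = [sum(row[k] for row in wt) for k in range(num_topics)]
    for doc, zs in zip(docs, assigns):
        dt = [0] * num_topics  # per-document topic counts
        for z in zs:
            dt[z] += 1
        for i, w in enumerate(doc):
            old = zs[i]
            # Remove the token's current assignment before resampling it.
            dt[old] -= 1
            wt[w][old] -= 1
            topic_tot[old] -= 1
            weights = [
                (dt[k] + alpha) * (wt[w][k] + beta)
                / (topic_tot[k] + vocab_size * beta)
                for k in range(num_topics)
            ]
            new = rng.choices(range(num_topics), weights=weights)[0]
            zs[i] = new
            dt[new] += 1
            wt[w][new] += 1
            topic_tot[new] += 1
    return [[wt[w][k] - global_wt[w][k] for k in range(num_topics)]
            for w in range(vocab_size)]


def train(docs, vocab_size, num_topics=2, num_parts=2, sweeps=5,
          alpha=0.1, beta=0.01, seed=0):
    """Return global word-topic counts after several parallel sweeps."""
    rng = random.Random(seed)
    parts = partition(docs, num_parts)
    assigns = [[[rng.randrange(num_topics) for _ in doc] for doc in part]
               for part in parts]
    # Initial global count: sum of counts over all partitions.
    global_wt = [[0] * num_topics for _ in range(vocab_size)]
    for part, part_assigns in zip(parts, assigns):
        for doc, zs in zip(part, part_assigns):
            for w, z in zip(doc, zs):
                global_wt[w][z] += 1
    with ThreadPoolExecutor(max_workers=num_parts) as pool:
        for s in range(sweeps):
            futures = [
                pool.submit(gibbs_sweep, part, part_assigns, global_wt,
                            num_topics, alpha, beta,
                            seed + 1 + s * num_parts + i)
                for i, (part, part_assigns) in enumerate(zip(parts, assigns))
            ]
            deltas = [f.result() for f in futures]  # wait for every worker
            # Aggregate the local changes into the global count.
            for delta in deltas:
                for w in range(vocab_size):
                    for k in range(num_topics):
                        global_wt[w][k] += delta[w][k]
    return global_wt
```

Calling `train(docs, vocab_size)` with `docs` given as lists of integer word ids returns a `vocab_size` by `num_topics` table of counts. Because each worker only reports its own change in counts, the aggregated table always accounts for every token exactly once, even though workers sample against a slightly stale snapshot.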
Public/Granted literature
- US20120117008A1: Parallel Processing Of Data Sets (Public/Granted day: 2012-05-10)