Patent search ap:("Red Hat Page Inc.") AND inv:"Ronald Nowling"

1.

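Invention patent application
METHOD FOR GENERATING SYNTHETIC DATA SETS AT SCALE WITH NON-REDUNDANT PARTITIONING (Status: under examination, published)

Publication number: US20180107729A1

Publication date: 2018-04-19

Application number: US15294142

Filing date: 2016-10-14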

Applicant: Red Hat, Inc.

Inventor: Jay Vyas, Ronald Nowling, Huamin Chen

IPC: G06F17/30, G06N99/00

CPC classification number: G06F16/285, G06N20/00

Abstract: An example system includes a first machine and a second machine, a clustering module, and a training module. The clustering module receives a plurality of data sets, each including attributes. The clustering module partitions the plurality of data sets into a first clustered data set and a second clustered data set. Each data set of the plurality of data sets is partitioned. The training module assigns a first stochastic model to the first clustered data set and a second stochastic model to the second clustered data set. The first machine selects the first clustered data set and the first stochastic model and generates a first synthetic data set having generated data for each one of the attributes. The second machine selects the second clustered data set and the second stochastic model and generates a second synthetic data set having generated data for each one of the attributes.
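
The flow the abstract describes (partition every data set into exactly one cluster, assign a stochastic model per cluster, then let each machine generate from its own cluster and model) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the patent: the attribute names, the 2-means clustering, the per-attribute Gaussian used as the "stochastic model", and the use of two function calls with different seeds to stand in for the two machines.

```python
import random
import statistics

def partition(records, attributes, rounds=10):
    """Split records into two clusters; each record lands in exactly one."""
    # Assumption: plain 2-means over numeric attributes, seeded with the
    # first and last record. The abstract does not name a clustering method.
    def dist(r, c):
        return sum((r[a] - c[a]) ** 2 for a in attributes)

    centers = [dict(records[0]), dict(records[-1])]
    clusters = [[], []]
    for _ in range(rounds):
        clusters = [[], []]
        for r in records:
            idx = 0 if dist(r, centers[0]) <= dist(r, centers[1]) else 1
            clusters[idx].append(r)
        for i, members in enumerate(clusters):
            if members:
                centers[i] = {a: statistics.fmean(m[a] for m in members)
                              for a in attributes}
    return clusters

def fit_model(cluster, attributes):
    """Assumed stochastic model: independent Gaussian per attribute."""
    return {a: (statistics.fmean(r[a] for r in cluster),
                statistics.pstdev(r[a] for r in cluster))
            for a in attributes}

def generate(model, n, seed):
    """One 'machine': draw n synthetic rows covering every attribute."""
    rng = random.Random(seed)
    return [{a: rng.gauss(mu, sigma) for a, (mu, sigma) in model.items()}
            for _ in range(n)]

# Hypothetical input data with two numeric attributes.
attributes = ["age", "visits"]
records = [
    {"age": 20, "visits": 1},
    {"age": 22, "visits": 2},
    {"age": 61, "visits": 9},
    {"age": 65, "visits": 11},
]

clusters = partition(records, attributes)
models = [fit_model(c, attributes) for c in clusters]

# Each call stands in for a separate machine working on its own cluster.
synthetic_a = generate(models[0], n=3, seed=1)
synthetic_b = generate(models[1], n=3, seed=2)
```

Because the two clusters do not overlap, the two generation calls share no input data and could run independently, which is the property the title calls non-redundant partitioning.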

2.

Granted invention patent
Method for generating synthetic data sets at scale with non-redundant partitioning (Status: in force)

Publication number: US10891311B2

Publication date: 2021-01-12

Application number: US15294142

Filing date: 2016-10-14

Applicant: Red Hat, Inc.

Inventor: Jay Vyas, Ronald Nowling, Huamin Chen

IPC: G06F16/28, G06N20/00, G16H10/60, G16H50/70, G06N7/00

Abstract: An example system includes a first machine and a second machine, a clustering module, and a training module. The clustering module receives a plurality of data sets, each including attributes. The clustering module partitions the plurality of data sets into a first clustered data set and a second clustered data set. Each data set of the plurality of data sets is partitioned. The training module assigns a first stochastic model to the first clustered data set and a second stochastic model to the second clustered data set. The first machine selects the first clustered data set and the first stochastic model and generates a first synthetic data set having generated data for each one of the attributes. The second machine selects the second clustered data set and the second stochastic model and generates a second synthetic data set having generated data for each one of the attributes.
