Invention Grant
- Patent Title: Partitioning of data mining training set
- Application No.: US11371477
- Application Date: 2006-03-09
- Publication No.: US07756881B2
- Publication Date: 2010-07-13
- Inventor: Ioan Bogdan Crivat, Raman S. Iyer, C. James MacLennan
- Applicant: Ioan Bogdan Crivat, Raman S. Iyer, C. James MacLennan
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agency: Workman Nydegger
- Main IPC: G06F7/00
- IPC: G06F7/00; G06F17/30

Abstract:
A system that effectuates fetching a complete set of relational data into a mining services server and subsequently defining desired partitions upon the fetched data is provided. In accordance with the innovation, the data can be locally cached and partitioned therefrom. Accordingly, upon the same mining structure (e.g., cache) that has been partitioned, the novel innovation can build mining models for each partition. In other words, the innovation can employ the concept of mining structure as a data cache while manipulating only partitions of this cache in certain operations. The innovation can be employed in scenarios where a user wants to train a mining model using only data points that satisfy a particular Boolean condition, a user wants to split the training set into multiple partitions (e.g., training/testing) and/or a user wants to perform a data mining procedure known as “N-fold cross validation.”
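The three scenarios named in the abstract (Boolean-condition filtering, a training/testing split, and N-fold cross validation) can be illustrated with a minimal Python sketch. It is not the patented implementation or any Microsoft API: the class `MiningStructureCache` and its methods `filter_partition`, `holdout_split`, and `n_folds` are hypothetical names chosen for illustration, and the "cache" is assumed to be a plain in-memory list of rows. The point it shows is that the data is fetched once and each partition is only a list of row indices into that cache.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]


class MiningStructureCache:
    """Hypothetical in-memory stand-in for a mining structure used as a data cache."""

    def __init__(self, rows: Sequence[Row]) -> None:
        # The complete data set is copied once; partitions never copy rows.
        self.rows: List[Row] = list(rows)

    def filter_partition(self, predicate: Callable[[Row], bool]) -> List[int]:
        """Indices of cached rows that satisfy a Boolean condition."""
        return [i for i, row in enumerate(self.rows) if predicate(row)]

    def holdout_split(
        self, test_fraction: float, seed: int = 0
    ) -> Tuple[List[int], List[int]]:
        """Split the cache into disjoint (training, testing) index lists."""
        indices = list(range(len(self.rows)))
        random.Random(seed).shuffle(indices)
        n_test = int(round(len(indices) * test_fraction))
        return sorted(indices[n_test:]), sorted(indices[:n_test])

    def n_folds(self, n: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
        """N-fold cross validation: each fold serves once as the testing set."""
        if not 2 <= n <= len(self.rows):
            raise ValueError("n must be between 2 and the number of cached rows")
        indices = list(range(len(self.rows)))
        random.Random(seed).shuffle(indices)
        folds = [indices[k::n] for k in range(n)]
        result = []
        for k in range(n):
            test = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            result.append((train, test))
        return result


# Illustrative data only: ten made-up rows with ages 20 through 29.
cache = MiningStructureCache([{"age": a, "buyer": a % 2 == 0} for a in range(20, 30)])
over_24 = cache.filter_partition(lambda r: r["age"] > 24)
train_idx, test_idx = cache.holdout_split(test_fraction=0.3)
folds = cache.n_folds(5)
```

A model for a given partition would then be trained on `[cache.rows[i] for i in train_idx]`, so several models can share the same cached data without it being fetched again.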
Public/Granted literature:
- US20070214135A1 Partitioning of data mining training set (Public/Granted day: 2007-09-13)