Invention Grant
- Patent Title: Efficient data infrastructure for high dimensional data analysis
- Application No.: US11818879
- Application Date: 2007-06-15
- Publication No.: US07870114B2
- Publication Date: 2011-01-11
- Inventor: Haidong Zhang, Guowei Liu, Yantao Li, Bing Sun, Jian Wang
- Applicant: Haidong Zhang, Guowei Liu, Yantao Li, Bing Sun, Jian Wang
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Described is a technology by which high dimensional source data corresponding to rows of records with identifiers, and columns comprising dimensions of data values, are processed into a file model for efficient access. An inverted index corresponding to any dimension is built by mapping data from raw dimension values to mapped values based on mapping entries in a dimension table. The record identifiers are arranged into subgroups based on their mapped value; a count and/or an offset may be maintained for locating each of the subgroups. The raw values for a dimension are maintained within a raw value file. For sparse data, the raw value file may be compressed, e.g., by excluding nulls and associating a record identifier with each non-null. A data manager provides access to data in the data files, such as by offering various functions, using caching for efficiency.
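The file model in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names (`build_inverted_index`, `lookup`, `compress_sparse`), the in-memory lists standing in for files, and the choice to leave null values out of the inverted index are all assumptions made for illustration. It shows a dimension table mapping raw values to mapped values, record identifiers arranged into subgroups located by an offset and a count, and a sparse raw value file that keeps only non-null entries paired with their record identifiers.

```python
from collections import defaultdict

def build_inverted_index(raw_values, dimension_table):
    """Group record ids by mapped value (hypothetical helper).

    raw_values: list of raw dimension values, indexed by record id.
    dimension_table: dict mapping each raw value to an integer mapped value.
    Returns (record_ids, directory), where directory maps a mapped value
    to the (offset, count) of its subgroup inside record_ids.
    """
    groups = defaultdict(list)
    for record_id, raw in enumerate(raw_values):
        if raw is None:  # assumption: nulls are not indexed
            continue
        groups[dimension_table[raw]].append(record_id)

    record_ids = []
    directory = {}
    for mapped in sorted(groups):
        ids = groups[mapped]
        directory[mapped] = (len(record_ids), len(ids))  # offset, count
        record_ids.extend(ids)
    return record_ids, directory

def lookup(record_ids, directory, mapped):
    """Return the record ids whose dimension has the given mapped value."""
    offset, count = directory.get(mapped, (0, 0))
    return record_ids[offset:offset + count]

def compress_sparse(raw_values):
    """Sparse raw value file: keep only (record id, value) for non-nulls."""
    return [(rid, v) for rid, v in enumerate(raw_values) if v is not None]

# Example data (invented for illustration).
raw_values = ["red", None, "blue", "red", None, "green"]
dimension_table = {"red": 0, "blue": 1, "green": 2}

record_ids, directory = build_inverted_index(raw_values, dimension_table)
red_records = lookup(record_ids, directory, dimension_table["red"])
sparse_file = compress_sparse(raw_values)
```

In this sketch a lookup reads one contiguous slice of `record_ids`, which is the point of keeping an offset and a count per subgroup; a real implementation would presumably store these structures in files and cache them, as the abstract's data manager suggests.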
Public/Granted literature
- US20080313213A1 Efficient data infrastructure for high dimensional data analysis. Public/Granted day: 2008-12-18