Invention Grant
- Patent Title: Data classification and hierarchical clustering
- Application No.: US12602908
- Application Date: 2008-06-11
- Publication No.: US08407164B2
- Publication Date: 2013-03-26
- Inventor: Hassan Haider Malik, John Ronald Kender
- Applicant: Hassan Haider Malik, John Ronald Kender
- Applicant Address: US NY New York
- Assignee: The Trustees of Columbia University in the City of New York
- Current Assignee: The Trustees of Columbia University in the City of New York
- Current Assignee Address: US NY New York
- Agency: Schwegman Lundberg Woessner P.A.
- International Application: PCT/US2008/007308 WO 20080611
- International Announcement: WO2008/154029 WO 20081218
- Main IPC: G06F15/18
- IPC: G06F15/18

Abstract:
Apparatus, systems, and methods can operate to provide efficient data clustering, data classification, and data compression. One method comprises processing a training set of training instances to select a subset of size-1 patterns, initialize a weight of each size-1 pattern, include the size-1 patterns in classes in a model associated with the training set, and then include a set of top-k size-2 patterns in a way that provides an effective balance between the local, class, and global significance of patterns. Another method comprises processing a dataset to compute an overall significance value of each size-2 pattern in each instance in the dataset, sort the size-2 patterns, and select the top-k size-2 patterns to be represented in clusters, which can be refined into a clustered hierarchy. A further method comprises creating an uncompressed bitmap, reordering the bitmap, and compressing the bitmap. Additional apparatus, systems, and methods are disclosed.
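The bitmap steps named in the abstract (create, reorder, compress) can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how the bitmap is reordered or compressed, so the lexicographic row sort and the per-column run-length encoding below are assumptions chosen for illustration, and all function names are hypothetical.

```python
from itertools import groupby


def build_bitmap(transactions, items):
    """Create an uncompressed bitmap: one row per transaction, one 0/1 column per item."""
    return [[1 if item in t else 0 for item in items] for t in transactions]


def reorder_rows(bitmap):
    """Reorder rows lexicographically so that similar rows become adjacent.

    Assumption: the abstract does not name a reordering strategy; a plain
    sort is used here only because it tends to lengthen runs in each column.
    """
    return sorted(bitmap)


def rle_column(column):
    """Run-length encode one column as (bit, run_length) pairs."""
    return [(bit, len(list(run))) for bit, run in groupby(column)]


def compress(bitmap):
    """Compress the bitmap column by column (assumed scheme: run-length encoding)."""
    return [rle_column(col) for col in zip(*bitmap)]


def total_runs(compressed):
    """Number of (bit, run_length) pairs across all columns; fewer means smaller."""
    return sum(len(col) for col in compressed)


# Toy data, invented for this example.
transactions = [{"a", "b"}, {"c"}, {"a", "b"}, {"c"}, {"a"}, {"b", "c"}]
items = ["a", "b", "c"]

bitmap = build_bitmap(transactions, items)
before = total_runs(compress(bitmap))
after = total_runs(compress(reorder_rows(bitmap)))
print(f"runs without reordering: {before}, with reordering: {after}")
```

On this toy input the reordered bitmap needs fewer runs than the original row order, which is the reason for placing a reordering step before compression. Note that sorting rows discards the original row order, so a real implementation would also have to keep a mapping back to the original instances.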
Public/Granted literature
- US20100174670A1 DATA CLASSIFICATION AND HIERARCHICAL CLUSTERING Public/Granted day: 2010-07-08