Abstract:
High-dimensional data is subjected to hierarchical visualization so that the complete data set can be viewed in a top-down hierarchy, as clusters at the top level and sub-clusters at deeper levels. The data set is modeled with standard finite normal mixture models and probabilistic principal component projections, the parameters of which are estimated using the expectation-maximization algorithm and principal component analysis under the Akaike Information Criterion (AIC) and the Minimum Description Length (MDL) criterion. The high-dimensional raw data is first processed using principal component analysis to reveal the dominant distribution of the data at a first level. The so-processed information is then further processed to reveal sub-clusters within the primary clusters. The clusters and sub-clusters at each hierarchical level are projected visually to reveal the underlying structure. The inventive scheme has utility in all applications in which high-dimensional multivariate data is to be reduced to a two- or three-dimensional projection space to allow visual exploration of the underlying structure of the data set.
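
The first-level step described above can be illustrated with a minimal sketch. It is not the claimed implementation: it shows only a plain principal component projection of the raw data onto two axes, together with the commonly used forms of the two selection criteria. The function names, the power-iteration eigen-solver, and the specific criterion formulas (AIC = -2 log L + 2K, MDL = -log L + 0.5 K log N) are assumptions made for illustration; the abstract names the criteria but does not state their formulas. The mixture fitting by expectation-maximization and the deeper sub-cluster levels are not shown.

```python
import math
import random


def center(data):
    """Subtract the per-dimension mean from every row."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in data]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_components(cov, k=2, iters=500, seed=0):
    """Leading k eigenvectors of cov via power iteration with deflation.

    Illustrative solver only; a library eigen-decomposition would
    normally be used instead.
    """
    rng = random.Random(seed)
    d = len(cov)
    work = [row[:] for row in cov]  # copy so the caller's matrix is untouched
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        n0 = math.sqrt(sum(x * x for x in v))
        v = [x / n0 for x in v]
        for _ in range(iters):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # no variance left in any direction
                break
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(work, v)))
        components.append(v)
        # Deflate: remove the variance explained by this component.
        for i in range(d):
            for j in range(d):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return components


def pca_project(data, k=2):
    """Project rows of data onto the k leading principal axes."""
    centered = center(data)
    components = top_components(covariance(centered), k)
    coords = [[sum(a * b for a, b in zip(row, c)) for c in components]
              for row in centered]
    return coords, components


def aic(log_likelihood, n_params):
    """Akaike Information Criterion in its standard form (assumed)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def mdl(log_likelihood, n_params, n_samples):
    """Minimum Description Length in its standard form (assumed)."""
    return -log_likelihood + 0.5 * n_params * math.log(n_samples)
```

In a sketch of this kind, candidate mixture models with different numbers of clusters would each be fitted, and the candidate giving the smallest `aic` or `mdl` value would be retained; the two-column output of `pca_project` is what would be plotted at the top level.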