Invention Grant
US08214365B1 Measuring confidence of file clustering and clustering based file classification (in force)

Measuring confidence of file clustering and clustering based file classification
Abstract:
A uniformity of a cluster of samples is determined, and a corresponding raw confidence value is calculated. A confidence interval weight is calculated using a confidence interval to determine reliability of the uniformity. A trace length weight is calculated, as a function of traces of the samples. An n-gram weight is calculated, as a function of numbers of n-grams generated by the samples. A compactness weight is calculated, as a function of the similarity of the samples. A cluster weight is calculated as a function of the four above-described weights. A cluster confidence measurement is calculated as a function of the cluster weight and the raw confidence value. When a new sample is assigned to the cluster, an assignment confidence measurement is calculated, as a function of the cluster's confidence measurement and the sample's trace length, n-grams and similarity.
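The abstract names the quantities involved but not the functions that combine them. The following Python sketch shows one way the pieces could fit together. Everything specific in it is an assumption made for illustration: the `Sample` fields, the use of a Wilson score interval for the confidence interval weight, the linear `saturating` weights with their thresholds `FULL_TRACE_LENGTH` and `FULL_NGRAM_COUNT`, and the use of plain products to combine weights. None of these choices is taken from the patent.

```python
import math
from collections import Counter
from dataclasses import dataclass

# Hypothetical thresholds at which a trace or n-gram count earns full weight.
FULL_TRACE_LENGTH = 1000
FULL_NGRAM_COUNT = 500


@dataclass
class Sample:
    label: str         # known classification, e.g. "malicious" or "benign"
    trace_length: int  # length of the sample's trace
    ngram_count: int   # number of n-grams generated from the sample
    similarity: float  # similarity to the rest of the cluster, in [0, 1]


def saturating(value, full_at):
    """Assumed weight shape: grows linearly with value, capped at 1.0."""
    return min(1.0, value / full_at)


def wilson_width(successes, n, z=1.96):
    """Width of the Wilson score interval for a proportion (z=1.96 ~ 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return 2.0 * half


def cluster_confidence(samples):
    """Cluster confidence = cluster weight * raw confidence (assumed form)."""
    if not samples:
        raise ValueError("cluster must contain at least one sample")
    n = len(samples)

    # Uniformity: share of samples carrying the majority label.
    majority = Counter(s.label for s in samples).most_common(1)[0][1]
    raw_confidence = majority / n

    # Narrow interval -> uniformity estimate is reliable -> weight near 1.
    ci_weight = 1.0 - wilson_width(majority, n)
    trace_weight = saturating(sum(s.trace_length for s in samples) / n,
                              FULL_TRACE_LENGTH)
    ngram_weight = saturating(sum(s.ngram_count for s in samples) / n,
                              FULL_NGRAM_COUNT)
    compactness_weight = sum(s.similarity for s in samples) / n

    # Assumption: the four weights are combined by multiplication.
    cluster_weight = (ci_weight * trace_weight
                      * ngram_weight * compactness_weight)
    return cluster_weight * raw_confidence


def assignment_confidence(cluster_conf, trace_length, ngram_count, similarity):
    """Confidence of assigning a new sample to a cluster (assumed form)."""
    return (cluster_conf
            * saturating(trace_length, FULL_TRACE_LENGTH)
            * saturating(ngram_count, FULL_NGRAM_COUNT)
            * similarity)


if __name__ == "__main__":
    cluster = [Sample("malicious", 2000, 800, 0.9) for _ in range(30)]
    conf = cluster_confidence(cluster)
    print("cluster confidence:", round(conf, 3))
    print("assignment confidence:",
          round(assignment_confidence(conf, 500, 250, 0.8), 3))
```

Under these assumptions a small cluster is penalized even when it is perfectly uniform, because its confidence interval is wide, and a new sample with a short trace, few n-grams, or low similarity can only lower the confidence inherited from its cluster.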