Data generalization for predictive models
Abstract:
A method, apparatus and a product for data generalization for predictive models. The method comprising: based on a labeled dataset, determining a plurality of buckets, each of which has an associated label; determining a plurality of clusters, grouping similar instances in the same bucket; based on the plurality of clusters, determining an alternative set of features comprising a set of generalized features, wherein each generalized feature corresponds to a cluster of the plurality of clusters, wherein a generalized feature that corresponds to a cluster is indicative of the instance being mapped to the corresponding cluster; obtaining a second instance; determining a generalized second instance that comprises a valuation of the alternative set of features for the second instance; and based on the generalized second instance, determining a label for the second instance.
Public/Granted literature
Information query
Patent Agency Ranking
0/0