Geo-clustering for data de-identification
Abstract:
The present disclosure relates to a method of geo-clustering data in order to de-identify a dataset. The method includes generating a plurality of geoclusters based on a plurality of geocodes. The geocodes may include ZIP codes or postal codes. The method further includes identifying the geocluster having the smallest population. That geocluster is merged with its nearest geocluster, and the identification and merging are repeated until the smallest remaining geocluster meets a minimum population threshold. Once the threshold is met, the plurality of geoclusters can be used to cluster the geocodes within a dataset to be de-identified.
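The iterative merging described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed implementation: the function name `geo_cluster`, the input layout (geocode mapped to latitude, longitude, population), the use of straight-line distance between population-weighted centroids as the measure of "nearest", and the choice of the lowest member geocode as a cluster label are all assumptions that the abstract does not specify.

```python
import math


def geo_cluster(geocodes, min_population):
    """Merge geocodes into clusters until the smallest meets min_population.

    geocodes: dict mapping geocode -> (lat, lon, population)  [assumed layout]
    Returns a dict mapping each geocode -> label of its cluster.
    """
    # Start with one cluster per geocode.
    clusters = [
        {"codes": [code], "lat": lat, "lon": lon, "pop": pop}
        for code, (lat, lon, pop) in sorted(geocodes.items())
    ]

    while len(clusters) > 1:
        # Identify the cluster with the smallest population.
        smallest = min(clusters, key=lambda c: c["pop"])
        if smallest["pop"] >= min_population:
            break  # every cluster now meets the threshold

        # Find its nearest neighbour (assumption: Euclidean distance
        # on lat/lon between cluster centroids).
        others = [c for c in clusters if c is not smallest]
        nearest = min(
            others,
            key=lambda c: math.hypot(
                c["lat"] - smallest["lat"], c["lon"] - smallest["lon"]
            ),
        )

        # Merge the smallest cluster into the nearest one, moving the
        # centroid to the population-weighted mean of the two.
        total = nearest["pop"] + smallest["pop"]
        if total > 0:
            nearest["lat"] = (
                nearest["lat"] * nearest["pop"] + smallest["lat"] * smallest["pop"]
            ) / total
            nearest["lon"] = (
                nearest["lon"] * nearest["pop"] + smallest["lon"] * smallest["pop"]
            ) / total
        nearest["codes"].extend(smallest["codes"])
        nearest["pop"] = total
        clusters = others

    # Label each cluster by its lowest member geocode (hypothetical convention).
    return {code: min(c["codes"]) for c in clusters for code in c["codes"]}


# Made-up example data: two sparsely populated codes next to two large ones.
example = {
    "10001": (40.75, -73.99, 500),
    "10002": (40.72, -73.98, 30000),
    "90001": (33.97, -118.25, 800),
    "90002": (33.95, -118.24, 25000),
}
mapping = geo_cluster(example, min_population=20000)
```

The resulting `mapping` could then be used to replace each geocode in a dataset with its cluster label. If the total population across all geocodes is below the threshold, this sketch ends with a single cluster that still falls short; how that case should be handled is not stated in the abstract.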