US 12,142,383 B2
Geo-clustering for data de-identification
Andrew Richard Baker, Alcove (CA); and Khaled El Emam, Ottawa (CA)
Assigned to Privacy Analytics Inc., Ontario (CA)
Filed by PRIVACY ANALYTICS INC., Ottawa (CA)
Filed on May 31, 2022, as Appl. No. 17/828,073.
Application 17/828,073 is a division of application No. 15/591,630, filed on May 10, 2017, granted, now 11,380,441.
Claims priority of provisional application 62/334,261, filed on May 10, 2016.
Prior Publication US 2022/0293280 A1, Sep. 15, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G16H 50/30 (2018.01); G06F 16/29 (2019.01); G16H 10/60 (2018.01); H04L 67/52 (2022.01)
CPC G16H 50/30 (2018.01) [G06F 16/29 (2019.01); H04L 67/52 (2022.05); G16H 10/60 (2018.01)] 19 Claims
 
1. An apparatus, comprising:
a processor and memory configured to:
receive data at the processor representing a plurality of clusters, wherein each cluster represents a geographic area having a centroid and a population of individuals;
merge a first cluster having a smallest population with a second cluster having a nearest centroid based on a determination that the population of the first cluster is below a minimum size threshold to form an updated cluster that represents an updated geographic area having an updated centroid and an updated population that reflects the merging of the first cluster and the second cluster;
re-perform the merge until each cluster meets a minimum size threshold,
de-identify data records associated with the individuals of the clusters when the smallest cluster meets the minimum size threshold,
assess a risk of re-identification of the de-identified clusters based on k-anonymity,
increase the minimum size threshold and re-perform the merge, the de-identify, and the assess a risk, based on a determination that the assessed risk does not meet a risk criterion, and
present the de-identified clusters on a display based on a determination that the assessed risk meets the risk criterion.
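
The merge-and-reassess loop recited in claim 1 can be illustrated with a minimal sketch. It is not the patentee's implementation and rests on several assumptions the claim does not state: centroids are planar (x, y) points compared by Euclidean distance, the updated centroid is the population-weighted mean of the two merged centroids, the k-anonymity risk is taken as 1 divided by the smallest cluster population, and the threshold is raised by a fixed step. The names Cluster, merge_pair, merge_until_min_size, max_risk, and cluster_for_risk are hypothetical. De-identifying the individual data records and presenting the result on a display are omitted.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Cluster:
    x: float          # centroid coordinate (assumed planar)
    y: float
    population: int   # number of individuals, assumed positive


def merge_pair(a, b):
    """Merge two clusters; the weighted centroid is an assumption."""
    total = a.population + b.population
    return Cluster(
        (a.x * a.population + b.x * b.population) / total,
        (a.y * a.population + b.y * b.population) / total,
        total,
    )


def merge_until_min_size(clusters, min_size):
    """Repeatedly merge the smallest cluster into its nearest neighbour."""
    clusters = list(clusters)
    while len(clusters) > 1:
        smallest = min(clusters, key=lambda c: c.population)
        if smallest.population >= min_size:
            break  # every cluster now meets the threshold
        others = [c for c in clusters if c is not smallest]
        nearest = min(
            others, key=lambda c: hypot(c.x - smallest.x, c.y - smallest.y)
        )
        clusters = [c for c in others if c is not nearest]
        clusters.append(merge_pair(smallest, nearest))
    return clusters


def max_risk(clusters):
    """Assumed risk model: 1 / k, with k the smallest cluster population."""
    return 1.0 / min(c.population for c in clusters)


def cluster_for_risk(clusters, min_size, risk_threshold, step=1):
    """Raise the minimum size and re-merge until the risk criterion is met."""
    while True:
        merged = merge_until_min_size(clusters, min_size)
        # Stop as well when a single cluster remains: nothing is left to merge.
        if max_risk(merged) <= risk_threshold or len(merged) == 1:
            return merged, min_size
        min_size += step
```

For example, cluster_for_risk(clusters, min_size=10, risk_threshold=0.05) would return the merged clusters together with the minimum size threshold at which the assumed risk criterion was first satisfied.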