US 12,393,727 B2
Distance preserving hash method
Abdelrahman Ali Almahmoud, Abu Dhabi (AE); Ernesto Damiani, Abu Dhabi (AE); Hadi Otrok, Abu Dhabi (AE); and Yousof Ali Alhammadi, Abu Dhabi (AE)
Assigned to KHALIFA UNIVERSITY OF SCIENCE AND TECHNOLOGY, Abu Dhabi (AE); BRITISH TELECOMMUNICATIONS PLC, London (GB); and EMIRATES TELECOMMUNICATIONS CORPORATION, Abu Dhabi (AE)
Appl. No. 17/601,145
Filed by KHALIFA UNIVERSITY OF SCIENCE AND TECHNOLOGY, Abu Dhabi (AE); BRITISH TELECOMMUNICATIONS PLC, London (GB); and EMIRATES TELECOMMUNICATIONS CORPORATION, Abu Dhabi (AE)
PCT Filed Apr. 3, 2019, PCT No. PCT/EP2019/058428
§ 371(c)(1), (2) Date Oct. 4, 2021,
PCT Pub. No. WO2020/200447, PCT Pub. Date Oct. 8, 2020.
Prior Publication US 2022/0215126 A1, Jul. 7, 2022
Int. Cl. G06F 21/62 (2013.01); G06F 18/22 (2023.01); G06F 21/60 (2013.01)
CPC G06F 21/6254 (2013.01) [G06F 18/22 (2023.01); G06F 21/602 (2013.01)]
15 Claims
 
1. A computer-implemented method of preparing an anonymised dataset for use in data analytics, the method including the steps of:
(a) labelling elements of a dataset to be analysed according to a labelling scheme;
(b) selecting a subsample from the dataset and deriving therefrom an accuracy threshold indicative of the distance between elements of data within the subsample;
(c) deriving, from the anonymised dataset, an estimated accuracy of distance measurement between elements of the anonymised dataset and comparing this estimated accuracy to the accuracy threshold;
(d) selecting one or more labelled elements of the dataset to be replaced with a distance preserving hash; and for each selected element:
(e) partitioning a data plane including the selected element into a plurality of channels, each channel covering a different distance space of the data plane;
(f) hashing, using a cryptographic hash, data associated with the channel of the data plane in which the selected element resides, to form the distance preserving hash; and
(g) replacing the selected element with the distance preserving hash.
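
Steps (e) to (g) can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a one-dimensional numeric element, fixed-width channels, SHA-256 as the cryptographic hash, and a secret salt. The names `channel_index`, `distance_preserving_hash`, `anonymise`, `width` and `salt` are hypothetical and do not appear in the claim.

```python
import hashlib
import math


def channel_index(value: float, width: float) -> int:
    """Step (e), assumed form: split the number line into fixed-width
    channels and return the index of the channel containing `value`."""
    if width <= 0:
        raise ValueError("width must be positive")
    return math.floor(value / width)


def distance_preserving_hash(value: float, width: float, salt: bytes) -> str:
    """Step (f), assumed form: hash the channel identifier, not the raw value.
    Two values in the same channel therefore yield the same digest."""
    idx = channel_index(value, width)
    payload = salt + str(idx).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def anonymise(records, field, width, salt):
    """Step (g): return copies of the records with `field` replaced
    by its distance preserving hash. The input records are not modified."""
    anonymised = []
    for record in records:
        copy = dict(record)
        copy[field] = distance_preserving_hash(record[field], width, salt)
        anonymised.append(copy)
    return anonymised


if __name__ == "__main__":
    salt = b"example-secret"  # assumption: a secret kept by the data owner
    data = [{"id": 1, "x": 10.2}, {"id": 2, "x": 10.9}, {"id": 3, "x": 42.0}]
    for row in anonymise(data, "x", width=1.0, salt=salt):
        print(row["id"], row["x"][:16])
```

In this sketch, elements 1 and 2 fall in the same channel and receive identical digests, while element 3 receives a different one, so an analyst can still tell which elements are close without seeing the raw values. The channel width sets the trade-off: wider channels reveal less about each element but give coarser distance estimates, which is the quantity steps (b) and (c) compare against the accuracy threshold.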