US 11,941,153 B2
De-identification method for big data
Won Suk Lee, Seoul (KR)
Assigned to BOALA CO., LTD., Seoul (KR)
Appl. No. 17/608,040
Filed by BOALA CO., LTD., Seoul (KR)
PCT Filed May 31, 2019, PCT No. PCT/KR2019/006586
§ 371(c)(1), (2) Date Nov. 1, 2021,
PCT Pub. No. WO2020/241943, PCT Pub. Date Dec. 3, 2020.
Prior Publication US 2022/0215128 A1, Jul. 7, 2022
Int. Cl. G06F 21/62 (2013.01)
CPC G06F 21/6254 (2013.01)
4 Claims
1. A de-identification processing method of big data performed in a data server having a communication unit, a processing unit and a storage unit, the de-identification processing method comprising:
storing, by the processing unit, data collected through the communication unit from a terminal connected through a wired/wireless network in the storage unit of the data server; and
a data abstraction step, by the processing unit, of generating a record different from original records by combining at least two records among the original records constituting the data, wherein
the data abstraction step includes:
setting at least one field among fields of the original record constituting the data as an abstraction reference field, and setting at least one field other than the abstraction reference field as an abstraction target field;
selecting at least every two (N) records having same abstraction reference field values among the original records as an abstraction target record group;
abstracting the selected N abstraction target record groups into one abstraction record including the abstraction reference field and the abstraction target field, in which a numerical attribute field of the abstraction record is allocated to include at least one value among statistical function values, and a category attribute field of the abstraction record is allocated as a connection-type attribute value including a corresponding category attribute value and an occurrence rate value of the corresponding category attribute value in the abstraction target record group;
selecting at least every two (M) records among records in which a number of records having all same values of the abstraction reference fields is less than N, as an abstraction target record group;
abstracting the selected M abstraction target record groups into one abstraction record including the abstraction reference field and the abstraction target field, in which a numerical attribute field of the abstraction record is allocated to include at least one value among statistical function values, and the category attribute field of the abstraction record is allocated as a connection-type attribute value including a corresponding category attribute value and an occurrence rate value of the corresponding category attribute value in the abstraction target record group; and
storing, by the processing unit, the abstraction record in the storage unit as a record of the abstract data.
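The data abstraction step of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`summarize`, `chunk`, `abstract`), the choice of mean/min/max as the statistical function values, the `value:rate` string layout joined with `|` for the connection-type attribute value, and the handling of leftovers (a short tail joins the preceding group, and a residual pool smaller than M is suppressed) are all assumptions made for illustration. The claim does not specify how reference fields are filled for the M-record groups, so the sketch summarizes them the same way as the target fields.

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize(values):
    """Summarize one field over a record group (hypothetical helper)."""
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        # Numerical attribute: keep only statistical function values.
        return {"mean": mean(values), "min": min(values), "max": max(values)}
    # Category attribute: "value:rate" pairs joined into one string
    # (assumed layout for the connection-type attribute value).
    counts = Counter(values)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return "|".join(f"{v}:{c / len(values):.2f}" for v, c in ranked)


def chunk(records, size):
    """Split into groups of `size`; a short tail joins the last group (assumption)."""
    groups = [records[i:i + size] for i in range(0, len(records), size)]
    if len(groups) > 1 and len(groups[-1]) < size:
        tail = groups.pop()
        groups[-1] = groups[-1] + tail
    return groups


def abstract(records, ref_fields, target_fields, n=2, m=2):
    """Return abstraction records built from groups of at least n (or m) records."""
    # Bucket original records by their abstraction reference field values.
    buckets = defaultdict(list)
    for rec in records:
        buckets[tuple(rec[f] for f in ref_fields)].append(rec)

    out, residual = [], []
    for key, bucket in buckets.items():
        if len(bucket) < n:
            # Fewer than N records share these reference values.
            residual.extend(bucket)
            continue
        for group in chunk(bucket, n):
            row = dict(zip(ref_fields, key))  # reference values are identical
            for f in target_fields:
                row[f] = summarize([r[f] for r in group])
            out.append(row)

    # Pool the remaining records and abstract them in groups of at least M.
    # A pool smaller than M is suppressed rather than emitted (assumption).
    if len(residual) >= m:
        for group in chunk(residual, m):
            fields = list(ref_fields) + list(target_fields)
            out.append({f: summarize([r[f] for r in group]) for f in fields})
    return out


records = [
    {"city": "Seoul", "age": 30, "dx": "flu"},
    {"city": "Seoul", "age": 40, "dx": "flu"},
    {"city": "Seoul", "age": 50, "dx": "cold"},
    {"city": "Busan", "age": 20, "dx": "cold"},
    {"city": "Daegu", "age": 60, "dx": "flu"},
]
result = abstract(records, ["city"], ["age", "dx"], n=2, m=2)
for row in result:
    print(row)
```

In this sample, `city` is the abstraction reference field and `age` and `dx` are the abstraction target fields. The three Seoul records collapse into one abstraction record, and the single Busan and Daegu records, which cannot form a group of N on their own, are pooled into a second abstraction record.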