US 12,130,938 B2
Data product release method or system
Charles Codman Cabot, Cambridge (GB); Kieron Francois Pascal Guinamard, Cambridge (GB); Jason Derek McFall, Cambridge (GB); Pierre-Andre Maugis, Cambridge (GB); Hector Page, Cambridge (GB); Benjamin Thomas Pickering, Cambridge (GB); Theresa Stadler, Cambridge (GB); Jo-anne Tay, Cambridge (GB); and Suzanne Weller, Cambridge (GB)
Assigned to PRIVITAR LIMITED, Cambridge (GB)
Appl. No. 16/955,542
Filed by PRIVITAR LIMITED, Cambridge (GB)
PCT Filed Dec. 18, 2018, PCT No. PCT/GB2018/053666
§ 371(c)(1), (2) Date Jun. 18, 2020,
PCT Pub. No. WO2019/122854, PCT Pub. Date Jun. 27, 2019.
Claims priority of application No. 1721189 (GB), filed on Dec. 18, 2017; and application No. 1814105 (GB), filed on Aug. 30, 2018.
Prior Publication US 2021/0012028 A1, Jan. 14, 2021
Int. Cl. G06F 21/62 (2013.01); G06F 9/54 (2006.01); G06F 17/12 (2006.01); G06F 17/18 (2006.01); G06F 21/57 (2013.01)
CPC G06F 21/6245 (2013.01) [G06F 9/547 (2013.01); G06F 17/12 (2013.01); G06F 17/18 (2013.01); G06F 21/577 (2013.01); G06F 21/6227 (2013.01)]
62 Claims
 
1. A computer-implemented data product release method, the method comprising the steps of:
deriving a data product release from a sensitive dataset using a differentially private system, wherein the data product release is a bounded or fixed set of statistics that is (a) predefined by a data holder and (b) derived from the sensitive dataset using the differentially private system, wherein the sensitive dataset includes raw data;
configuring, by the data holder, a prioritization of statistics in the set of statistics, wherein the statistics comprise one or more of a sum, count, average, median, min, or max;
configuring privacy protection parameters of the differentially private system as part of the data product release method to alter the balance between maintaining privacy of the sensitive dataset and making the data product release useful;
wherein the privacy protection parameters comprise a privacy protection parameter epsilon;
automatically determining a distribution of noise values to be added to the set of statistics by applying multiple different attacks to the set of statistics and by taking into account the prioritization of statistics configured by the data holder;
deriving the set of statistics from the sensitive dataset without providing access to any one or more of the raw data or raw data values within the sensitive dataset; and
directly calculating the privacy protection parameter epsilon from attack characteristics to get the desired attack success.
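The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and every name in it (`epsilon_for_attack_success`, `split_budget`, `release`, the priority weights, the sensitivities) is hypothetical. It makes three assumptions that the claim does not state: noise is drawn from a Laplace distribution, the data holder's prioritization is expressed as weights that divide the total epsilon across statistics, and "attack success" means the posterior confidence of an attacker who starts from a 50/50 prior and tries to tell two neighbouring datasets apart, for which epsilon-differential privacy bounds the posterior by e^epsilon / (1 + e^epsilon). The sketch does not run any attacks against the statistics; it only shows how an epsilon could follow from a target success bound.

```python
import math
import random


def epsilon_for_attack_success(max_success: float) -> float:
    """Return the epsilon at which an attacker with a 50/50 prior reaches
    at most `max_success` posterior confidence (assumed attack model)."""
    if not 0.5 < max_success < 1.0:
        raise ValueError("max_success must lie strictly between 0.5 and 1")
    # Invert max_success = e^eps / (1 + e^eps).
    return math.log(max_success / (1.0 - max_success))


def split_budget(total_epsilon: float, priorities: dict) -> dict:
    """Divide the total epsilon across statistics in proportion to the
    (hypothetical) priority weights; higher weight means less noise."""
    total_weight = sum(priorities.values())
    return {name: total_epsilon * w / total_weight
            for name, w in priorities.items()}


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release(true_stats: dict, sensitivities: dict, priorities: dict,
            max_success: float, seed=None) -> dict:
    """Return noisy statistics; the raw values never leave this function."""
    rng = random.Random(seed)
    budgets = split_budget(epsilon_for_attack_success(max_success), priorities)
    return {
        name: value + laplace_noise(sensitivities[name] / budgets[name], rng)
        for name, value in true_stats.items()
    }


if __name__ == "__main__":
    stats = {"count": 120.0, "sum_salary": 5_400_000.0}
    sens = {"count": 1.0, "sum_salary": 100_000.0}   # assumed sensitivities
    prio = {"count": 3.0, "sum_salary": 1.0}         # assumed priorities
    print(release(stats, sens, prio, max_success=0.75, seed=0))
```

Under these assumptions, a statistic with a higher priority receives a larger share of epsilon and therefore a smaller noise scale, and the per-statistic shares sum to the total epsilon (sequential composition).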