US 12,229,315 B2
Machine learning model generating system with enhanced privacy protection, machine learning model generating method with enhanced privacy protection
Yukihisa Fujita, Tokyo-to (JP)
Assigned to TOYOTA JIDOSHA KABUSHIKI KAISHA, Toyota (JP)
Filed by TOYOTA JIDOSHA KABUSHIKI KAISHA, Toyota (JP)
Filed on Jul. 20, 2022, as Appl. No. 17/813,734.
Claims priority of application No. 2021-122700 (JP), filed on Jul. 27, 2021.
Prior Publication US 2023/0029851 A1, Feb. 2, 2023
Int. Cl. G06F 21/62 (2013.01); G06F 18/2415 (2023.01); H04L 9/40 (2022.01)

CPC G06F 21/6254 (2013.01) [G06F 18/2415 (2023.01)]

4 Claims

1. A system comprising:

a memory storing a collection of personal information data and a data catalog of the collection of personal information data; and

a processing apparatus configured to execute:

acquiring designation of metadata in the data catalog;

acquiring a first data range determining a part of the collection of personal information data;

acquiring a machine learning logic;

generating a machine learning model according to the machine learning logic, based on personal information data corresponding to designated metadata and the first data range;

calculating a personal identification risk which shows a risk of a person being identified based on an output of the machine learning model; and

outputting the machine learning model when the personal identification risk does not exceed a predetermined threshold;

wherein

each piece of personal information data in the collection corresponds to one or more pieces of ID information, each of which shows a particular individual, and

the calculating the personal identification risk includes:

selecting input data from the collection of personal information data;

acquiring output data of the machine learning model by inputting the input data;

generating correspondence information which shows correspondence between the one or more pieces of ID information corresponding to the input data and the output data for the input data; and

calculating the personal identification risk based on the correspondence information;

the calculating the personal identification risk further includes:

classifying the output data into categories;

calculating, for each of the combinations of the one or more pieces of ID information and the categories, the number of pieces of output data which fall into the category specified by the combination and correspond, in the correspondence information, to the ID information specified by the combination; and

calculating the personal identification risk according to the following formula:

where:

IR is the personal identification risk, (i, j) is one of the combinations, and u(i, j) is the number of pieces of output data calculated for (i, j).
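The counting and gating steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names count_outputs, release_if_safe, records, classify and risk_fn are hypothetical, and the model is assumed to be any callable. The formula for IR is not reproduced in this text, so the sketch takes the risk calculation as a caller-supplied function instead of guessing it.

```python
from collections import Counter
from typing import Any, Callable, Hashable, Iterable, Optional, Sequence, Tuple

# One record: the ID information tied to an input, and the input itself.
Record = Tuple[Sequence[Hashable], Any]


def count_outputs(
    records: Iterable[Record],
    model: Callable[[Any], Any],
    classify: Callable[[Any], Hashable],
) -> Counter:
    """Return u(i, j): how many outputs fall into category j for ID i."""
    counts: Counter = Counter()
    for ids, input_data in records:
        output = model(input_data)      # acquire output data for the input
        category = classify(output)     # classify the output into a category
        for person_id in ids:           # correspondence: each ID <-> this output
            counts[(person_id, category)] += 1
    return counts


def release_if_safe(
    model: Callable[[Any], Any],
    counts: Counter,
    risk_fn: Callable[[Counter], float],
    threshold: float,
) -> Optional[Callable[[Any], Any]]:
    """Return the model only when the risk does not exceed the threshold.

    risk_fn is an assumption: it stands in for the IR formula, which is
    not reproduced in the text above.
    """
    risk = risk_fn(counts)
    return model if risk <= threshold else None


# Toy data (hypothetical): two individuals, "A" and "B".
records = [(["A"], 1.0), (["A"], 1.2), (["B"], 5.0), (["A", "B"], 5.5)]
toy_model = lambda x: x * 2
toy_classify = lambda y: "low" if y < 5 else "high"

u = count_outputs(records, toy_model, toy_classify)
```

With the toy data, u maps ("A", "low") to 2, ("B", "high") to 2 and ("A", "high") to 1. Any function of these counts can then be passed as risk_fn to decide whether the model is output.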