| CPC G06F 21/6254 (2013.01) [G06F 16/258 (2019.01); G06F 16/285 (2019.01)] | 3 Claims |

|
3. A synthetic data generation method for execution by a synthetic data generation apparatus that includes storage and processing circuitry, the synthetic data generation method comprising:
a coding step in which the processing circuitry codes a value of each of category attributes contained in original data into a value of a numerical attribute in accordance with a coding rule which is stored in the storage and indicates correspondence between a code and a value of a category attribute;
a data formatting step in which the processing circuitry generates first synthetic data from the original data after coding using a synthetic data generation method for numerical attributes;
a conversion step in which, if the value of the numerical attribute which is contained in the first synthetic data and corresponds to the value of one of the category attributes exceeds a range of values that can be assumed by the value of that numerical attribute, the processing circuitry converts the value of that numerical attribute to a value included in the range of values that can be assumed by the value of that numerical attribute; and
a decoding step in which the processing circuitry decodes the value of the numerical attribute which is contained in the first synthetic data after conversion and which corresponds to the value of one of the category attributes to the value of that category attribute in accordance with the coding rule to obtain synthetic data, wherein
the value of the numerical attribute is a value that can be measured numerically, and the value of the category attribute is a value that cannot be measured numerically,
the coding rule is a 1-of-K coding method, and
the synthetic data maintains relationships among all the attributes in the original data, wherein
(i) the relationships are variance-covariance or (ii) the relationships are correlation.
|
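The four steps recited in claim 3 (coding, generation, conversion, decoding) can be illustrated with a minimal sketch. The claim does not specify which synthetic data generation method for numerical attributes is used, so the sketch assumes one possibility: sampling from a multivariate normal distribution fitted to the mean and variance-covariance matrix of the coded data. The function names (`encode`, `generate_numeric`, `convert`, `decode`, `synthesize`), the clipping range of 0 to 1 for 1-of-K codes, and the use of the largest code value to pick the decoded category are all illustrative assumptions, not taken from the claim.

```python
import numpy as np


def encode(labels, categories):
    """Coding step: 1-of-K code each category value (one column per category)."""
    index = {c: i for i, c in enumerate(categories)}
    coded = np.zeros((len(labels), len(categories)))
    for row, value in enumerate(labels):
        coded[row, index[value]] = 1.0
    return coded


def generate_numeric(data, n, rng):
    """Generation step (assumed method): sample from a multivariate normal
    fitted to the mean and variance-covariance matrix of the coded data."""
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    # The covariance of 1-of-K columns is singular, so skip the validity warning.
    return rng.multivariate_normal(mean, cov, size=n, check_valid="ignore")


def convert(coded, lo=0.0, hi=1.0):
    """Conversion step: pull out-of-range code values back into [lo, hi]."""
    return np.clip(coded, lo, hi)


def decode(coded, categories):
    """Decoding step (assumed rule): pick the category with the largest code."""
    return [categories[i] for i in coded.argmax(axis=1)]


def synthesize(numeric, labels, categories, n, seed=0):
    """Run all four steps on one numerical and one category attribute."""
    rng = np.random.default_rng(seed)
    coded = encode(labels, categories)
    original = np.column_stack([np.asarray(numeric, dtype=float), coded])
    first_synthetic = generate_numeric(original, n, rng)
    # Only the columns that stand for the category attribute are converted.
    codes = convert(first_synthetic[:, 1:])
    return first_synthetic[:, 0], decode(codes, categories)


if __name__ == "__main__":
    ages = [23.0, 35.0, 41.0, 29.0, 52.0, 47.0]
    colors = ["red", "blue", "red", "green", "blue", "red"]
    syn_ages, syn_colors = synthesize(ages, colors, ["red", "green", "blue"], n=5)
    for age, color in zip(syn_ages, syn_colors):
        print(round(float(age), 1), color)
```

In this sketch the fitted distribution reproduces the mean and variance-covariance of the coded original data only in expectation, and only before the conversion step is applied; clipping and decoding can shift those statistics, so the sketch should not be read as demonstrating the maintained relationships the claim requires.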