| CPC G06F 16/3329 (2019.01) [G06F 16/22 (2019.01); G06F 16/2228 (2019.01); G06F 16/2477 (2019.01); G06F 16/252 (2019.01); G06F 16/285 (2019.01); G06F 16/316 (2019.01); G06F 16/338 (2019.01); G06F 16/367 (2019.01); G06F 40/20 (2020.01); G06N 3/006 (2013.01); G06N 7/01 (2023.01); G06N 20/00 (2019.01)] | 15 Claims |

|
1. A method for generating synthesized user data, comprising:
receiving a data specification schema that specifies characteristics of test data objects comprising synthetic user data, wherein the characteristics comprise a third-party data source that includes a natural language text generation algorithm;
generating the test data objects, the generating of each test data object including:
determining, from the data specification schema, a number of fields of the test data object to be populated, the fields representing categories of simulated user data, and
determining values for the fields, the values simulating user data, wherein determining the values comprises:
determining values for a first field by querying the third-party data source such that the third-party data source returns, responsive to the query, realistic synthetic user data used to populate the values for the at least one field comprising a sequence of a plurality of words, wherein the third-party data source is associated with an entity other than an entity generating the test data objects,
determining values for a third field by generating a value associated with a second field and determining the value for the third field based on the generated value associated with the second field, wherein the data specification schema indicates that the third field is to have a value that is a function of the value associated with the second field, and
determining values for a fourth field by querying a second third-party data source that is a database, wherein the data specification schema indicates a query provided to the database, and wherein a response to the query corresponds to the value associated with the fourth field;
storing the test data objects in a database, the storing of each test data object including populating the fields with the determined values;
generating a tabular data file including the test data objects; and
transmitting the tabular data file to a computing system, wherein the computing system is configured to utilize the tabular data file comprising the realistic synthetic user data during a user data testing procedure of an application performed by the computing system.
|
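The generation steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it is hypothetical, including the schema layout (`SCHEMA`, `kind`, `from`, `fn`), the helper functions, and the field names. A local word picker stands in for the third-party natural language text generation source, and an in-memory SQLite database stands in for the second third-party data source. The final transmitting step is omitted; the sketch stops at producing the tabular (CSV) text.

```python
import csv
import io
import random
import sqlite3

def fake_text_source(rng, n_words):
    """Hypothetical stand-in for the third-party text generation source."""
    vocab = ["great", "product", "fast", "delivery", "would", "buy", "again", "okay"]
    return " ".join(rng.choice(vocab) for _ in range(n_words))

def make_reference_db():
    """Hypothetical stand-in for the second third-party source (a database)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cities (name TEXT)")
    db.executemany("INSERT INTO cities VALUES (?)", [("Oslo",), ("Lima",), ("Pune",)])
    return db

# Assumed schema layout: one entry per field, in generation order.
# A "derived" field must appear after the field it depends on.
SCHEMA = {
    "count": 3,
    "fields": {
        "review": {"kind": "text_source", "words": 5},
        "birth_year": {"kind": "random_int", "min": 1950, "max": 2005},
        "age_in_2024": {"kind": "derived", "from": "birth_year",
                        "fn": lambda year: 2024 - year},
        "city": {"kind": "db_query", "query": "SELECT name FROM cities"},
    },
}

def generate_objects(schema, ref_db, seed=0):
    """Build one dict per test data object, populating each schema field."""
    rng = random.Random(seed)
    objects = []
    for _ in range(schema["count"]):
        obj = {}
        for name, spec in schema["fields"].items():
            kind = spec["kind"]
            if kind == "text_source":
                # Field holding a sequence of several words from the text source.
                obj[name] = fake_text_source(rng, spec["words"])
            elif kind == "random_int":
                obj[name] = rng.randint(spec["min"], spec["max"])
            elif kind == "derived":
                # Value is a function of a previously generated field.
                obj[name] = spec["fn"](obj[spec["from"]])
            elif kind == "db_query":
                # The schema supplies the query; one returned row becomes the value.
                rows = ref_db.execute(spec["query"]).fetchall()
                obj[name] = rng.choice(rows)[0]
            else:
                raise ValueError(f"unknown field kind: {kind}")
        objects.append(obj)
    return objects

def store_objects(objects, store):
    """Persist the objects in a database table, one column per field."""
    cols = list(objects[0])
    store.execute(f"CREATE TABLE test_data ({', '.join(cols)})")
    placeholders = ", ".join("?" for _ in cols)
    store.executemany(
        f"INSERT INTO test_data VALUES ({placeholders})",
        [tuple(obj[c] for c in cols) for obj in objects],
    )

def to_csv(objects):
    """Render the objects as tabular CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(objects[0]))
    writer.writeheader()
    writer.writerows(objects)
    return buf.getvalue()
```

In this sketch, calling `generate_objects(SCHEMA, make_reference_db())`, then `store_objects` and `to_csv` on the result, mirrors the order of the generating, storing, and tabular-file steps. The derived field relies on Python dictionaries preserving insertion order, which is a design choice of the sketch rather than something the claim requires.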