US 12,093,753 B2
Method and system for synthetic generation of time series data
Austin Walters, Savoy, IL (US); Jeremy Goodsitt, Champaign, IL (US); Mark Watson, Sedona, AZ (US); Anh Truong, Champaign, IL (US); and Vincent Pham, Champaign, IL (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by CAPITAL ONE SERVICES, LLC, McLean, VA (US)
Filed on Apr. 20, 2020, as Appl. No. 16/852,795.
Application 16/852,795 is a continuation of application No. 16/454,041, filed on Jun. 26, 2019, granted, now U.S. Pat. No. 10,664,381.
Claims priority of provisional application 62/694,968, filed on Jul. 6, 2018.
Prior Publication US 2020/0250071 A1, Aug. 6, 2020
Int. Cl. G06F 16/00 (2019.01); G06F 8/71 (2018.01); G06F 9/54 (2006.01); G06F 11/36 (2006.01); G06F 16/22 (2019.01); G06F 16/242 (2019.01); G06F 16/2455 (2019.01); G06F 16/248 (2019.01); G06F 16/25 (2019.01); G06F 16/28 (2019.01); G06F 16/335 (2019.01); G06F 16/903 (2019.01); G06F 16/9032 (2019.01); G06F 16/9038 (2019.01); G06F 16/906 (2019.01); G06F 16/93 (2019.01); G06F 17/15 (2006.01); G06F 17/16 (2006.01); G06F 17/18 (2006.01); G06F 18/20 (2023.01); G06F 18/21 (2023.01); G06F 18/2115 (2023.01); G06F 18/214 (2023.01); G06F 18/22 (2023.01); G06F 18/23 (2023.01); G06F 18/24 (2023.01); G06F 18/2411 (2023.01); G06F 18/2415 (2023.01); G06F 18/40 (2023.01); G06F 21/55 (2013.01); G06F 21/60 (2013.01); G06F 21/62 (2013.01); G06F 30/20 (2020.01); G06F 40/117 (2020.01); G06F 40/166 (2020.01); G06F 40/20 (2020.01); G06N 3/04 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/06 (2006.01); G06N 3/08 (2023.01); G06N 3/088 (2023.01); G06N 5/00 (2023.01); G06N 5/02 (2023.01); G06N 5/04 (2023.01); G06N 7/00 (2023.01); G06N 7/01 (2023.01); G06N 20/00 (2019.01); G06Q 10/04 (2023.01); G06T 7/194 (2017.01); G06T 7/246 (2017.01); G06T 7/254 (2017.01); G06T 11/00 (2006.01); G06V 10/70 (2022.01); G06V 10/98 (2022.01); G06V 30/194 (2022.01); G06V 30/196 (2022.01); H04L 9/40 (2022.01); H04L 67/00 (2022.01); H04L 67/306 (2022.01); H04N 21/234 (2011.01); H04N 21/81 (2011.01)
CPC G06F 9/541 (2013.01) [G06F 8/71 (2013.01); G06F 9/54 (2013.01); G06F 9/547 (2013.01); G06F 11/3608 (2013.01); G06F 11/3628 (2013.01); G06F 11/3636 (2013.01); G06F 16/2237 (2019.01); G06F 16/2264 (2019.01); G06F 16/2423 (2019.01); G06F 16/24568 (2019.01); G06F 16/248 (2019.01); G06F 16/254 (2019.01); G06F 16/258 (2019.01); G06F 16/283 (2019.01); G06F 16/285 (2019.01); G06F 16/288 (2019.01); G06F 16/335 (2019.01); G06F 16/90332 (2019.01); G06F 16/90335 (2019.01); G06F 16/9038 (2019.01); G06F 16/906 (2019.01); G06F 16/93 (2019.01); G06F 17/15 (2013.01); G06F 17/16 (2013.01); G06F 17/18 (2013.01); G06F 18/2115 (2023.01); G06F 18/214 (2023.01); G06F 18/2148 (2023.01); G06F 18/217 (2023.01); G06F 18/2193 (2023.01); G06F 18/22 (2023.01); G06F 18/23 (2023.01); G06F 18/24 (2023.01); G06F 18/2411 (2023.01); G06F 18/2415 (2023.01); G06F 18/285 (2023.01); G06F 18/40 (2023.01); G06F 21/552 (2013.01); G06F 21/60 (2013.01); G06F 21/6245 (2013.01); G06F 21/6254 (2013.01); G06F 30/20 (2020.01); G06F 40/117 (2020.01); G06F 40/166 (2020.01); G06F 40/20 (2020.01); G06N 3/04 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/06 (2013.01); G06N 3/08 (2013.01); G06N 3/088 (2013.01); G06N 5/00 (2013.01); G06N 5/02 (2013.01); G06N 5/04 (2013.01); G06N 7/00 (2013.01); G06N 7/01 (2023.01); G06N 20/00 (2019.01); G06Q 10/04 (2013.01); G06T 7/194 (2017.01); G06T 7/246 (2017.01); G06T 7/248 (2017.01); G06T 7/254 (2017.01); G06T 11/001 (2013.01); G06V 10/768 (2022.01); G06V 10/993 (2022.01); G06V 30/194 (2022.01); G06V 30/1985 (2022.01); H04L 63/1416 (2013.01); H04L 63/1491 (2013.01); H04L 67/306 (2013.01); H04L 67/34 (2013.01); H04N 21/23412 (2013.01); H04N 21/8153 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]
20 Claims
1. A system for generating synthetic data, comprising:
one or more memory units storing instructions; and
one or more processors that execute the instructions to perform operations comprising:
receiving a request to generate a synthetic dataset, wherein the request indicates a desired method of data transformation and one or more of a desired statistical measure of the synthetic dataset and a desired data schema of the synthetic dataset;
receiving a dataset comprising time series data;
retrieving a data model based on the request, wherein the data model is trained to generate synthetic data based on a machine-learned relationship between data of at least two dimensions of a transformed dataset;
transforming the dataset by performing a first data transformation to at least a portion of the dataset, the first data transformation comprising:
at least one of an encoding method, a normalization method, or a time-based data processing method; and
a subtraction method on at least one dimension of the dataset;
generating, based on the desired data schema of the synthetic dataset or the desired statistical measure of the synthetic dataset, a synthetic transformed dataset by applying the trained data model to the transformed dataset;
generating the synthetic dataset by performing a second data transformation to the synthetic transformed dataset; and
providing the synthetic dataset for storage in a dataset database.
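The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a one-dimensional series, min-max scaling as the normalization method, and first-order differencing as the subtraction method. The names first_transform, second_transform, GaussianStepModel, and generate_synthetic are hypothetical, and the Gaussian sampler only stands in for the trained data model, which the claim describes as learning a relationship between at least two dimensions of the transformed dataset.

```python
import numpy as np


def first_transform(series):
    """First data transformation: min-max normalization, then subtraction.

    Returns the step-to-step differences plus the state needed to invert them.
    """
    lo, hi = float(series.min()), float(series.max())
    scale = (hi - lo) or 1.0          # avoid dividing by zero on a flat series
    normalized = (series - lo) / scale
    diffs = np.diff(normalized)       # subtract each value's predecessor
    state = {"lo": lo, "scale": scale, "start": float(normalized[0])}
    return diffs, state


def second_transform(diffs, state):
    """Second data transformation: undo the differencing, then the scaling."""
    start = state["start"]
    normalized = np.concatenate(([start], start + np.cumsum(diffs)))
    return normalized * state["scale"] + state["lo"]


class GaussianStepModel:
    """Hypothetical stand-in for the trained data model (assumption).

    It samples new steps matching the mean and standard deviation of the
    observed steps; a real model would be trained ahead of time.
    """

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)

    def generate(self, transformed):
        return self.rng.normal(
            transformed.mean(), transformed.std(), size=transformed.shape
        )


def generate_synthetic(series, model):
    """Transform, apply the model, then map back to the original scale."""
    diffs, state = first_transform(series)
    synthetic_diffs = model.generate(diffs)
    return second_transform(synthetic_diffs, state)
```

In this sketch the second transformation is the exact inverse of the first, so a series passed through both without the model step is recovered unchanged. The request handling, model retrieval, and database storage steps of the claim are omitted.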