US 11,886,516 B1
Generating synthetic data based on time series predictions and plural machine learning models
Austin Walters, Savoy, IL (US); and Jeremy Goodsitt, Champaign, IL (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Sep. 12, 2022, as Appl. No. 17/931,166.
Int. Cl. G06F 16/00 (2019.01); G06F 16/906 (2019.01)
CPC G06F 16/906 (2019.01)
20 Claims
 
1. A method comprising:
receiving, by one or more computing devices, one or more data records that include transaction data of a user;
based on applying one or more classification techniques to the one or more data records, determining, by the one or more computing devices, a first category for the one or more data records, wherein the first category is of a plurality of categories;
determining, by the one or more computing devices and based on the one or more data records, data record segments, wherein each segment of the data record segments includes a time step of the transaction data;
based on the data record segments, determining, by the one or more computing devices, first time series model input;
using a first model and the first time series model input, determining, by the one or more computing devices, a first predicted time step of predicted transaction data, wherein the first predicted time step is associated with the first category, wherein the first model is configured to predict first time steps associated with the first category, and wherein the first model is one of a plurality of machine learning models configured to predict time steps associated with the plurality of categories;
determining, by the one or more computing devices, second time series model input that includes the first predicted time step;
using the plurality of machine learning models and the second time series model input, determining, by the one or more computing devices, second predicted time steps of predicted transaction data, wherein the second predicted time steps are associated with the plurality of categories;
based on the first predicted time step and the second predicted time steps, determining, by the one or more computing devices, a plurality of potential time series of predicted transaction data;
determining, by the one or more computing devices, confidence values for the plurality of potential time series; and
based on the confidence values and based on the plurality of potential time series, determining, by the one or more computing devices, a synthetic time series for the user, wherein the synthetic time series includes the first predicted time step and one of the second predicted time steps.
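
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it is hypothetical: the two-category set, the keyword classifier, the `MeanModel` stand-in for a trained per-category model, the fixed per-model confidence, and the choice to score a potential time series by the product of its step confidences are all assumptions made for illustration. The claim does not specify any of them.

```python
from dataclasses import dataclass
from statistics import mean

CATEGORIES = ("grocery", "travel")  # hypothetical plurality of categories


@dataclass(frozen=True)
class Step:
    """One predicted time step of transaction data."""
    category: str
    amount: float
    confidence: float


def classify(records):
    """Toy classification: the category whose keyword appears in most memos."""
    counts = {c: sum(c in r["memo"] for r in records) for c in CATEGORIES}
    return max(CATEGORIES, key=counts.get)


def segment(records):
    """One segment per record, ordered by day; each segment holds one time step."""
    return [r["amount"] for r in sorted(records, key=lambda r: r["day"])]


class MeanModel:
    """Stand-in for a trained per-category model (an assumption, not the patent's)."""

    def __init__(self, category, scale, confidence):
        self.category = category
        self.scale = scale
        self.confidence = confidence

    def predict(self, window):
        # Predict the next amount as a scaled mean of the input window.
        amount = round(mean(window) * self.scale, 2)
        return Step(self.category, amount, self.confidence)


def synthesize(records, models, window=3):
    """Return (synthetic series, scored candidates) following the claimed steps."""
    first_category = classify(records)
    first_input = segment(records)[-window:]

    # First predicted time step: only the model for the first category is used.
    first_step = models[first_category].predict(first_input)

    # Second model input includes the first predicted time step.
    second_input = (first_input + [first_step.amount])[-window:]

    # Second predicted time steps: one from every per-category model.
    second_steps = [m.predict(second_input) for m in models.values()]

    # Potential time series, each scored by the product of step confidences
    # (the scoring rule is an assumption).
    scored = [(first_step.confidence * s.confidence, [first_step, s])
              for s in second_steps]
    best = max(scored, key=lambda pair: pair[0])
    return best[1], scored


records = [
    {"day": 1, "memo": "grocery store", "amount": 40.0},
    {"day": 2, "memo": "grocery store", "amount": 50.0},
    {"day": 3, "memo": "travel agent", "amount": 60.0},
]
models = {
    "grocery": MeanModel("grocery", scale=1.0, confidence=0.9),
    "travel": MeanModel("travel", scale=2.0, confidence=0.6),
}
series, candidates = synthesize(records, models)
```

In this sketch the selected series always contains the first predicted time step plus exactly one of the second predicted time steps, mirroring the final limitation of the claim. A longer horizon could be obtained by repeating the fan-out, at the cost of a candidate set that grows with the number of categories at each step.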