US 12,067,571 B2
Systems and methods for generating models for classifying imbalanced data
Amitabha Das, Naperville, IL (US); Akhil Sajitha Sreehari, Chicago, IL (US); Tianyue Mao, Palo Alto, CA (US); and Kexuan Zou, Stamford, CT (US)
Assigned to SYNCHRONY BANK, Stamford, CT (US)
Filed by Synchrony Bank, Stamford, CT (US)
Filed on Mar. 10, 2021, as Appl. No. 17/197,290.
Claims priority of provisional application 62/988,305, filed on Mar. 11, 2020.
Prior Publication US 2021/0287136 A1, Sep. 16, 2021
Int. Cl. G06Q 20/40 (2012.01); G06F 18/2413 (2023.01); G06N 20/00 (2019.01); G06Q 20/20 (2012.01); G06Q 20/38 (2012.01)
CPC G06Q 20/4016 (2013.01) [G06F 18/2413 (2023.01); G06N 20/00 (2019.01); G06Q 20/20 (2013.01); G06Q 20/382 (2013.01)]
24 Claims
 
1. A computer-implemented method comprising:
receiving a request to identify a classification model from a set of classification models, wherein the request specifies a set of classification algorithms and a set of sampling algorithms, wherein the request includes a data set including first data associated with a first characteristic and second data associated with a second characteristic, and wherein the request specifies one or more metrics for evaluating performance of the set of classification models;
using the set of classification algorithms and the set of sampling algorithms in combination to generate the set of classification models;
using the set of classification models to generate a set of classifications, wherein a classification of the set of classifications includes classifying the first data into majority data based on the first characteristic and the second data into minority data based on the second characteristic;
determining the performance of the set of classification models based on the set of classifications and according to the one or more metrics;
selecting the classification model, wherein the classification model is selected based on the performance of the set of classification models according to the one or more metrics; and
providing the classification model, a classification generated by the classification model using the data set, and a summary of the performance of the set of classification models according to the one or more metrics.
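
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. It is an assumption-laden toy in pure Python showing one way a request could pair every classification algorithm with every sampling algorithm, score the resulting models, and return the selected model with its classification and a performance summary. All names here (`select_model`, the `request` dictionary keys, the two classifiers, the three samplers) are hypothetical. The sketch assumes label 0 marks majority data and label 1 marks minority data, ranks models by the first metric listed, and scores on the same data set it trains on for brevity, which the claim does not specify.

```python
import random
from itertools import product

def dist2(p, q):
    """Squared Euclidean distance between two feature rows."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def _split(y):
    """Row indices of majority (label 0) and minority (label 1) data."""
    majority = [i for i, label in enumerate(y) if label == 0]
    minority = [i for i, label in enumerate(y) if label == 1]
    return majority, minority

# --- Sampling algorithms (hypothetical) ---
def no_sampling(X, y, rng):
    return X, y

def random_oversample(X, y, rng):
    """Duplicate random minority rows until both classes have equal size."""
    majority, minority = _split(y)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = majority + minority + extra
    return [X[i] for i in idx], [y[i] for i in idx]

def random_undersample(X, y, rng):
    """Keep a random subset of majority rows the size of the minority class."""
    majority, minority = _split(y)
    idx = rng.sample(majority, len(minority)) + minority
    return [X[i] for i in idx], [y[i] for i in idx]

# --- Classification algorithms (hypothetical): each returns a predict(row) function ---
def fit_nearest_centroid(X, y):
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda row: min(centroids, key=lambda c: dist2(centroids[c], row))

def fit_knn(X, y, k=3):
    def predict(row):
        nearest = sorted(range(len(X)), key=lambda i: dist2(X[i], row))[:k]
        votes = sum(y[i] for i in nearest)  # number of minority neighbours
        return 1 if votes * 2 > k else 0
    return predict

# --- Metrics, computed for the minority class ---
def _counts(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, fn

def recall(y_true, y_pred):
    tp, _, fn = _counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0

def precision(y_true, y_pred):
    tp, fp, _ = _counts(y_true, y_pred)
    return tp / (tp + fp) if tp + fp else 0.0

def select_model(request, seed=0):
    """Build one model per (classifier, sampler) pair, score each, pick the best."""
    X, y = request["data"]
    rng = random.Random(seed)
    models, summary = {}, {}
    for (c_name, fit), (s_name, sample) in product(
            request["classifiers"].items(), request["samplers"].items()):
        Xs, ys = sample(X, y, rng)           # rebalance the training data
        model = fit(Xs, ys)                  # generate the classification model
        preds = [model(row) for row in X]    # classify the full data set
        models[(c_name, s_name)] = (model, preds)
        summary[(c_name, s_name)] = {
            name: metric(y, preds) for name, metric in request["metrics"].items()}
    primary = next(iter(request["metrics"]))  # assumption: rank by first metric
    best = max(summary, key=lambda key: summary[key][primary])
    model, preds = models[best]
    return best, model, preds, summary

# Usage with a tiny made-up data set: 8 majority rows, 2 minority rows.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.4, 0.2), (0.3, 0.4),
     (0.5, 0.1), (0.2, 0.5), (0.1, 0.1), (3.0, 3.0), (3.2, 2.9)]
y = [0] * 8 + [1] * 2
request = {
    "data": (X, y),
    "classifiers": {"nearest_centroid": fit_nearest_centroid, "knn": fit_knn},
    "samplers": {"none": no_sampling, "oversample": random_oversample,
                 "undersample": random_undersample},
    "metrics": {"recall": recall, "precision": precision},
}
best, model, preds, summary = select_model(request)
print(best, summary[best])
```

In practice one would likely score each model on held-out data rather than on the training set, and would use established library implementations of the samplers and classifiers; the sketch only mirrors the order of operations in the claim.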