US 12,242,939 B2
Method, system, and computer program product for synthetic oversampling for boosting supervised anomaly detection
Kwei-Herng Lai, Houston, TX (US); Lan Wang, Sunnyvale, CA (US); Huiyuan Chen, San Jose, CA (US); Mangesh Bendre, Sunnyvale, CA (US); Mahashweta Das, Campbell, CA (US); and Hao Yang, San Jose, CA (US)
Assigned to Visa International Service Association, San Francisco, CA (US)
Appl. No. 18/686,563
Filed by Visa International Service Association, San Francisco, CA (US)
PCT Filed Aug. 4, 2023, PCT No. PCT/IB2023/057912 § 371(c)(1), (2) Date Feb. 26, 2024, PCT Pub. No. WO2024/033771, PCT Pub. Date Feb. 15, 2024.
Claims priority of provisional application 63/397,719, filed on Aug. 12, 2022.
Prior Publication US 2024/0281718 A1, Aug. 22, 2024
Int. Cl. G06N 20/00 (2019.01); G06F 18/2413 (2023.01)

CPC G06N 20/00 (2019.01) [G06F 18/24147 (2023.01)]

15 Claims

1. A method, comprising:

obtaining, with at least one processor, a training dataset X^train including a plurality of source samples including a plurality of labeled normal samples and a plurality of labeled anomaly samples;

executing, with the at least one processor, a training episode by:

(i) initializing a timestamp t;

(ii) receiving, from an actor network π of an actor critic framework including the actor network π and a critic network Q, an action vector a_t for the timestamp t, wherein the actor network π is configured to generate the action vector a_t based on a state s_t, wherein the state s_t is determined based on a current pair of source samples of the plurality of source samples, and wherein the action vector a_t includes a size of a nearest neighborhood k, a composition ratio α, a number of oversampling n, and a termination probability ∈;

(iii) combining the current pair of source samples according to the composition ratio α and the number of oversampling n to generate a labeled synthetic sample x_syn associated with a label y_syn;

(iv) training, using the labeled synthetic sample x_syn and the label y_syn, a machine learning classifier ϕ;

(v) obtaining, based on the size of a nearest neighborhood k, source samples in the k-nearest neighborhood of the labeled synthetic sample x_syn;

(vi) generating, with the machine learning classifier ϕ, for the source samples in the k-nearest neighborhood of the labeled synthetic sample x_syn and a subset of the plurality of source samples of the training dataset X^train in a validation dataset X^val, a plurality of classifier outputs;

(vii) selecting, from the source samples in the k-nearest neighborhood of the labeled synthetic sample x_syn, a next pair of source samples;

(viii) storing, in a memory buffer, the state s_t, the action vector a_t, a next state s_{t+1}, and a reward r_t, wherein the next state s_{t+1} is determined based on the next pair of source samples, and wherein the reward r_t is determined based on the plurality of classifier outputs;

(ix) determining whether the termination probability ∈ satisfies a termination threshold;

(x) in response to determining that the termination probability ∈ fails to satisfy the termination threshold, incrementing the timestamp t, for a number of training steps S:

training the critic network Q according to a critic loss function that depends on the state s_t, the action vector a_t, and the reward r_t; and

training the actor network π according to an actor loss function that depends on an output of the critic network, and

after training the actor network π and the critic network Q for the number of training steps S, returning to step (ii) with the next pair of source samples as the current pair of source samples;

(xi) in response to determining that the termination probability ∈ satisfies the termination threshold, determining whether the number of training episodes executed satisfies a threshold number of training episodes;

(xii) in response to determining that the number of training episodes executed fails to satisfy the threshold number of training episodes, return to step (i) to execute a next training episode; and

(xiii) in response to determining that the number of training episodes executed satisfies the threshold number of training episodes, provide the machine learning classifier ϕ, wherein the plurality of source samples is associated with a plurality of transactions in a transaction processing network, wherein the plurality of labeled normal samples is associated with a plurality of non-fraudulent transactions of the plurality of transactions, and wherein the plurality of labeled anomaly samples is associated with a plurality of fraudulent transactions of the plurality of transactions;

receiving, with the at least one processor, transaction data associated with a transaction currently being processed in the transaction processing network;

processing, with the at least one processor, using the trained machine learning classifier ϕ, the transaction data to classify the transaction as a fraudulent or non-fraudulent transaction; and

in response to classifying the transaction as a fraudulent transaction, denying, with the at least one processor, authorization of the transaction in the transaction processing network.
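
Steps (iii), (v), and (ix) of the claim can be illustrated with a minimal Python sketch. The claim does not state the combination formula, the distance metric, or the threshold comparison, so everything below is an assumption made for illustration only: the pair is mixed by linear interpolation weighted by α (a mixup-style rule), n is read as the number of copies of the synthetic sample to emit, the neighborhood uses Euclidean distance, and termination means ∈ is at or above a threshold. The function names `combine_pair`, `k_nearest`, and `should_terminate`, the default threshold of 0.5, and the toy data are hypothetical and do not come from the patent.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def combine_pair(x_a: Vector, y_a: float, x_b: Vector, y_b: float,
                 alpha: float, n: int) -> Tuple[List[List[float]], List[float]]:
    """Step (iii), assumed rule: x_syn = alpha*x_a + (1-alpha)*x_b, emitted n times.

    The label y_syn is mixed with the same weights (an assumption).
    """
    x_syn = [alpha * a + (1.0 - alpha) * b for a, b in zip(x_a, x_b)]
    y_syn = alpha * y_a + (1.0 - alpha) * y_b
    return [list(x_syn) for _ in range(n)], [y_syn] * n


def k_nearest(x_syn: Vector, sources: Sequence[Vector], k: int) -> List[int]:
    """Step (v), assumed metric: indices of the k sources closest to x_syn (Euclidean)."""
    order = sorted(range(len(sources)), key=lambda i: math.dist(x_syn, sources[i]))
    return order[:k]


def should_terminate(epsilon: float, threshold: float = 0.5) -> bool:
    """Step (ix), assumed comparison: the episode ends when epsilon >= threshold."""
    return epsilon >= threshold


# Toy data (hypothetical): three normal samples (label 0) and one anomaly (label 1).
sources = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
labels = [0.0, 0.0, 0.0, 1.0]

# Mix a normal sample with the anomaly using alpha = 0.25 and n = 2.
xs, ys = combine_pair(sources[0], labels[0], sources[3], labels[3], alpha=0.25, n=2)

# Candidate source samples from which a next pair could be selected in step (vii).
neighbors = k_nearest(xs[0], sources, k=2)
```

In the claimed method, k, α, n, and ∈ are not fixed constants as in this sketch; they are components of the action vector a_t produced by the actor network π at each timestamp.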