CPC G06F 40/284 (2020.01) [G06F 18/2321 (2023.01); G06F 18/2415 (2023.01); G06F 40/30 (2020.01)] | 20 Claims |
8. A method comprising:
receiving, by a computer system, a request from a user-agent application of a device to access at least one resource associated with a service provider system;
extracting, from the request, an identifier of the user-agent application, wherein the identifier comprises a character string;
generating, from the character string, a plurality of character n-grams based on a plurality of word sizes, wherein the plurality of character n-grams comprises a first set of character n-grams corresponding to a first word size and a second set of character n-grams corresponding to a second word size;
determining a plurality of hash values based on performing one or more hash functions on the plurality of character n-grams;
converting, by the computer system, the character string into a numerical data vector representation of the user-agent application based on the plurality of hash values, wherein the converting comprises transforming each of the plurality of hash values into a numerical value within the numerical data vector representation of the user-agent application;
calculating, by the computer system and for the user-agent application, a predictive score based on the numerical data vector representation, wherein the predictive score indicates whether the identifier of the user-agent application corresponds to an anomaly based on a probability distribution function that models patterns learned from historic data associated with a plurality of user-agent applications that have requested access to the at least one resource associated with the service provider system;
comparing, by the computer system, the predictive score to a threshold; and
based on the comparing, classifying, by the computer system, the user-agent application as non-fraudulent or fraudulent.
|
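The steps recited in claim 8 can be illustrated with a minimal sketch. It is one possible reading under stated assumptions, not the implementation disclosed in the patent. The claim does not name a hash function, a vector length, or a particular probability distribution, so the use of MD5, the 64-bucket count vector, the word sizes 2 and 3, the diagonal Gaussian model, and every function name below are hypothetical choices made only for illustration.

```python
import hashlib
import math

DIM = 64  # assumed vector length; the claim does not specify one


def char_ngrams(text, sizes=(2, 3)):
    """Return character n-grams of `text`, one set per word size in `sizes`."""
    grams = []
    for n in sizes:
        grams.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return grams


def hash_value(gram):
    """Hash one n-gram to an integer (MD5 is an assumed, illustrative choice)."""
    digest = hashlib.md5(gram.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def to_vector(text, sizes=(2, 3), dim=DIM):
    """Turn each hash value into a bucket index and count hits per bucket."""
    vec = [0.0] * dim
    for gram in char_ngrams(text, sizes):
        vec[hash_value(gram) % dim] += 1.0
    return vec


def fit_gaussian(vectors, eps=1e-3):
    """Fit an independent Gaussian per dimension to historic vectors.

    `eps` is added to each variance to avoid division by zero.
    """
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(dim)]
    variances = [
        sum((v[j] - means[j]) ** 2 for v in vectors) / n + eps
        for j in range(dim)
    ]
    return means, variances


def anomaly_score(vec, means, variances):
    """Average negative log-likelihood of `vec`; higher means more anomalous."""
    nll = 0.0
    for x, m, var in zip(vec, means, variances):
        nll += 0.5 * (math.log(2 * math.pi * var) + (x - m) ** 2 / var)
    return nll / len(vec)


def classify(user_agent, model, threshold):
    """Compare the score to a threshold and label the user agent."""
    means, variances = model
    score = anomaly_score(to_vector(user_agent), means, variances)
    return "fraudulent" if score > threshold else "non-fraudulent"


# Hypothetical historic user-agent strings used to learn the distribution.
HISTORIC = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/121.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/122.0",
]
MODEL = fit_gaussian([to_vector(ua) for ua in HISTORIC])
```

In this sketch, a call such as `classify("curl/7.0", MODEL, threshold)` scores an unseen identifier against the learned distribution. The threshold would in practice be tuned on labeled or historic traffic. The sketch does not suggest a value for it.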