US 11,893,632 B2
Systems and methods for determining financial security risks using self-supervised natural language extraction
Minnie Virk, Jersey City, NJ (US); Rohan Mehta, Brooklyn, NY (US); Alberto Silva, Brooklyn, NY (US); Anthony Shewnarain, Valley Stream, NY (US); Steven Freeman, Cranford, NJ (US); Stephen Jurcsek, Jersey City, NJ (US); Leah Lewy, Jersey City, NJ (US); and Ross Arkin, Brooklyn, NY (US)
Assigned to CAPITAL ONE SERVICES, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Jan. 6, 2021, as Appl. No. 17/143,076.
Prior Publication US 2022/0215467 A1, Jul. 7, 2022
Int. Cl. G06Q 40/03 (2023.01); G06N 20/00 (2019.01); G06F 40/279 (2020.01); G06Q 20/38 (2012.01); G06F 40/30 (2020.01)
CPC G06Q 40/03 (2023.01) [G06F 40/279 (2020.01); G06N 20/00 (2019.01); G06Q 20/382 (2013.01); G06F 40/30 (2020.01)]
14 Claims
 
1. A system for dynamic detection of security features based on self-supervised natural language extraction from unstructured data sets, comprising:
one or more processors;
a memory in communication with the one or more processors and storing instructions that, when executed by the processor, are configured to cause the system to:
retrieve a first unstructured data array indicative of a plurality of discrete user events, wherein retrieving the first unstructured data array comprises retrieving the first unstructured data array from a database;
convert the unstructured data array by serializing the unstructured data array based on at least one indicator to form one or more first data arrays, the one or more first data arrays each indicative of a discrete user event;
for each data entry pair of the one or more first data arrays, determine a vector representation and correlation value using a neural network implemented by the one or more processors;
generate a value for each data entry of the one or more first data arrays based on the determined vector representations and correlation values, wherein the value for each data entry is determined at least in part by accessing the database to find the respective discrete user event for each data entry to determine whether the value represents a desired event with a positive value or a hazardous event with a negative value;
determine one or more second data arrays corresponding to a subset of the one or more first data arrays based on selecting a plurality of highest value data entries from the one or more first data arrays using the neural network, wherein selecting the plurality of highest value data entries includes dynamically generating narratives associated with each respective discrete user event of the one or more first data arrays and storing the narratives as the one or more second data arrays in the database;
determine a security weight for each discrete user event based on the one or more second data arrays and the at least one indicator associated with a respective discrete user event, the at least one indicator comprising a timestamp;
determine an associated sentiment score for each security weight;
determine a security score based on a weighted average of the sentiment score and the security weight;
when the security score of any discrete user event exceeds a predetermined threshold, execute one or more security actions, wherein at least one of the one or more security actions comprises generating an indication that the security score of a user account associated with the discrete user events exceeds the predetermined threshold;
compute one or more third data arrays from the one or more second data arrays using the neural network by providing the one or more second data arrays as input to the neural network;
iteratively calculate an error measurement between the one or more third data arrays and the one or more first data arrays; and
iteratively modify one or more weights of one or more layers of the neural network based on the calculated error measurement.
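
The scoring steps recited in claim 1 (a security weight and a sentiment score per discrete user event, combined by a weighted average and compared against a predetermined threshold) can be illustrated with a minimal Python sketch. This is not the patented implementation. The claim does not state the averaging coefficients, the threshold value, or how the security weight and sentiment score are computed, so the names UserEvent, security_score, and flag_events, the equal default coefficients, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class UserEvent:
    """One discrete user event. All field names are hypothetical."""
    event_id: str
    security_weight: float  # assumed to be computed upstream (narratives + timestamp)
    sentiment_score: float  # assumed to be computed upstream for that security weight


def security_score(event, w_sentiment=0.5, w_weight=0.5):
    """Weighted average of the sentiment score and the security weight.

    The coefficients are illustrative assumptions; the claim does not give them.
    """
    total = w_sentiment + w_weight
    if total <= 0:
        raise ValueError("coefficients must sum to a positive number")
    weighted_sum = (w_sentiment * event.sentiment_score
                    + w_weight * event.security_weight)
    return weighted_sum / total


def flag_events(events, threshold, w_sentiment=0.5, w_weight=0.5):
    """Return the IDs of events whose security score exceeds the threshold.

    A non-empty result stands in for the claimed security action of
    generating an indication for the associated user account.
    """
    return [
        e.event_id
        for e in events
        if security_score(e, w_sentiment, w_weight) > threshold
    ]


# Sample values are invented for illustration only.
sample_events = [
    UserEvent("evt-1", security_weight=0.9, sentiment_score=0.7),
    UserEvent("evt-2", security_weight=0.2, sentiment_score=0.1),
]
flagged = flag_events(sample_events, threshold=0.5)
```

With the sample values above, only evt-1 scores above the assumed threshold of 0.5. Dividing by the sum of the coefficients keeps the result a true weighted average even when the two coefficients do not sum to one.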