US 12,147,447 B2
Systems and methods for formatting data using a recurrent neural network
Anh Truong, Champaign, IL (US); Reza Farivar, Champaign, IL (US); Austin Walters, Savoy, IL (US); and Jeremy Goodsitt, Champaign, IL (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Jun. 23, 2023, as Appl. No. 18/340,166.
Application 18/340,166 is a continuation of application No. 17/833,147, filed on Jun. 6, 2022, granted, now 11,727,031.
Application 17/833,147 is a continuation of application No. 17/078,775, filed on Oct. 23, 2020, granted, now 11,360,996, issued on Jun. 14, 2022.
Application 17/078,775 is a continuation of application No. 16/810,230, filed on Mar. 5, 2020, granted, now 10,853,385, issued on Dec. 1, 2020.
Prior Publication US 2023/0334063 A1, Oct. 19, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/00 (2019.01); G06F 16/25 (2019.01); G06F 16/901 (2019.01); G06F 17/18 (2006.01); G06N 3/045 (2023.01); G06N 3/047 (2023.01)
CPC G06F 16/258 (2019.01) [G06F 16/9024 (2019.01); G06F 17/18 (2013.01); G06N 3/045 (2023.01); G06N 3/047 (2023.01)]
20 Claims
1. A system for formatting data, the system comprising:
at least one memory storing instructions; and
one or more processors configured to execute the instructions to perform operations comprising:
generating a first probabilistic graph, the first probabilistic graph including a set of nodes corresponding to positions in received data value sequences, by iteratively:
determining conditional counts of occurrences of received data values at a subsequent node in the set of nodes, the conditional counts being based on counting instances of received data values at one or more preceding nodes in the set of nodes; and
determining conditional probabilities based on the conditional counts;
determining a similarity metric of a second probabilistic graph and the first probabilistic graph, the second probabilistic graph being generated by a machine learning model; and
training the machine learning model based on the similarity metric.
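
The graph-generation and similarity steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the claim does not state: each received data value sequence is a string whose characters are the data values, each node conditions on only the single immediately preceding node, and the similarity metric is one minus the mean total-variation distance between matching conditional distributions. The names `build_graph` and `similarity` are hypothetical. The training step is omitted; in practice the returned score would feed a loss for the machine learning model.

```python
from collections import Counter, defaultdict


def build_graph(sequences):
    """Build {position: {previous_value: {value: P(value | previous_value)}}}.

    Assumption: only the immediately preceding node is used as the condition.
    Position 0 has no preceding node, so its condition is None.
    """
    # Conditional counts: counts[position][previous_value][value]
    counts = defaultdict(lambda: defaultdict(Counter))
    for seq in sequences:
        prev = None
        for pos, value in enumerate(seq):
            counts[pos][prev][value] += 1
            prev = value

    # Conditional probabilities: normalise each counter by its total.
    graph = {}
    for pos, by_prev in counts.items():
        graph[pos] = {}
        for prev, counter in by_prev.items():
            total = sum(counter.values())
            graph[pos][prev] = {v: c / total for v, c in counter.items()}
    return graph


def similarity(graph_a, graph_b):
    """Return 1 minus the mean half-L1 distance over all (position, previous_value) contexts.

    Assumption: a context missing from one graph is treated as an all-zero
    distribution, so it contributes a distance of 0.5.
    """
    contexts = {
        (pos, prev)
        for graph in (graph_a, graph_b)
        for pos, by_prev in graph.items()
        for prev in by_prev
    }
    if not contexts:
        return 1.0
    total = 0.0
    for pos, prev in contexts:
        pa = graph_a.get(pos, {}).get(prev, {})
        pb = graph_b.get(pos, {}).get(prev, {})
        values = set(pa) | set(pb)
        total += 0.5 * sum(abs(pa.get(v, 0.0) - pb.get(v, 0.0)) for v in values)
    return 1.0 - total / len(contexts)


if __name__ == "__main__":
    # First graph from received data; second graph stands in for model output.
    received = build_graph(["12", "13", "12", "45"])
    generated = build_graph(["12", "13"])
    print(received[0][None])
    print(similarity(received, generated))
```

A nested dictionary keyed by position and preceding value keeps the conditional counts and the derived probabilities in the same shape, which makes the comparison between the two graphs a direct lookup.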