US 11,860,827 B2
System and method for identifying business logic and data lineage with machine learning
Jagmohan Singh, Coppell, TX (US); Suneetha Vakalapudi, Garnet Valley, PA (US); and Soren Tannehill, Lewis Center, OH (US)
Assigned to JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed by JPMorgan Chase Bank, N.A., New York, NY (US)
Filed on Sep. 7, 2021, as Appl. No. 17/467,938.
Application 17/467,938 is a continuation of application No. 16/115,968, filed on Aug. 29, 2018, granted, now 11,138,157.
Claims priority of provisional application 62/551,923, filed on Aug. 30, 2017.
Prior Publication US 2021/0406222 A1, Dec. 30, 2021
Int. Cl. G06F 16/00 (2019.01); G06F 16/178 (2019.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01); G06F 16/25 (2019.01)
CPC G06F 16/1794 (2019.01) [G06F 16/252 (2019.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)]
18 Claims
 
1. A system that generates pseudo code that represents data logic from a source system to a target system, the system comprising:
a computer server comprising a programmed computer processor configured to perform the steps of:
preprocessing source data using direct SQL and creating a comma separated values (CSV) file with header columns and target columns;
processing the CSV file using dataframes;
identifying a set of best source feature attributes using recursive feature elimination method in machine learning;
separating the attributes to continuous and categorical columns;
feeding the attributes to a machine learning algorithm; and
generating a descriptive tree path in pseudo code.
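
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, only one plausible reading under stated assumptions: an in-memory SQLite table named trades stands in for the source system, risk_flag is a hypothetical target column, pandas supplies the dataframes, and scikit-learn's RFE, DecisionTreeClassifier, and export_text stand in for the recursive feature elimination, the machine learning algorithm, and the pseudo code tree path. The claim does not name any of these libraries, table names, or columns. Text columns are one-hot encoded before elimination because scikit-learn's RFE expects numeric input; that encoding step is an addition of this sketch.

```python
import io
import sqlite3

import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical source system: an in-memory SQLite table with made-up rows.
rows = [
    (250.0, "US", 3, 7, "LOW"),
    (1800.0, "EU", 12, 2, "HIGH"),
    (90.0, "EU", 1, 5, "LOW"),
    (2400.0, "US", 30, 9, "HIGH"),
    (640.0, "APAC", 15, 1, "LOW"),
    (1500.0, "APAC", 4, 8, "HIGH"),
    (3100.0, "US", 9, 3, "HIGH"),
    (410.0, "EU", 22, 6, "LOW"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades (amount REAL, region TEXT, days_open INTEGER, "
    "noise INTEGER, risk_flag TEXT)"
)
conn.executemany("INSERT INTO trades VALUES (?, ?, ?, ?, ?)", rows)

# Preprocess with direct SQL and write a CSV with a header row.
# A StringIO buffer stands in for a CSV file on disk.
csv_buffer = io.StringIO()
pd.read_sql_query(
    "SELECT amount, region, days_open, noise, risk_flag FROM trades", conn
).to_csv(csv_buffer, index=False)
conn.close()
csv_buffer.seek(0)

# Process the CSV using a dataframe; risk_flag is the assumed target column.
df = pd.read_csv(csv_buffer)
target_col = "risk_flag"
features = df.drop(columns=[target_col])
y = df[target_col]

# Assumption of this sketch: one-hot encode text columns so RFE can rank them.
numeric_source_cols = features.select_dtypes(include="number").columns.tolist()
text_source_cols = [c for c in features.columns if c not in numeric_source_cols]
encoded = pd.get_dummies(features, columns=text_source_cols, dtype=int)

# Identify a set of best source attributes with recursive feature elimination.
selector = RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=2)
selector.fit(encoded, y)
selected_cols = encoded.columns[selector.support_].tolist()

# Separate the selected attributes into continuous and categorical columns.
continuous_selected = [c for c in selected_cols if c in numeric_source_cols]
categorical_selected = [c for c in selected_cols if c not in numeric_source_cols]

# Feed the attributes to a learner and render the tree paths as readable text.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(encoded[selected_cols], y)
pseudo_code = export_text(tree, feature_names=selected_cols)
print(pseudo_code)
```

In this toy data the target depends only on amount, so the printed tree reduces to a single threshold test on that column. A decision tree is used here because its root-to-leaf paths read naturally as if/else pseudo code; the claim itself says only "a machine learning algorithm".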