US 12,436,961 B2
Machine learning apparatus for data lineage transformation
Colm Fitzmaurice, Winona, MN (US)
Assigned to Bank of America Corporation, Charlotte, NC (US)
Filed by Bank of America Corporation, Charlotte, NC (US)
Filed on Nov. 7, 2023, as Appl. No. 18/387,710.
Prior Publication US 2025/0147747 A1, May 8, 2025
Int. Cl. G06F 16/24 (2019.01); G06F 16/21 (2019.01); G06F 16/2458 (2019.01)
CPC G06F 16/2468 (2019.01) [G06F 16/219 (2019.01)]
20 Claims
 
1. Apparatus for leveraging machine learning (“ML”) to selectively update stored code with updated code, the apparatus comprising:
machine readable memory configured to store a plurality of technical data element identifiers (“TDEIs”);
a computer configured to receive a query for data lineage information corresponding to a first TDEI associated with a first data lineage source;
a processor configured to leverage ML to:
identify a level of commonality between the first TDEI and a second TDEI, wherein the level of commonality is greater than both a threshold length percentage and a threshold alpha-numerical matching percentage;
identify data lineage information for the second TDEI corresponding to a second data lineage source;
determine whether the first TDEI and the second TDEI share a threshold level of commonality, wherein the threshold level of commonality is greater than a threshold number of alpha-numerical matches associated with both the first TDEI and the second TDEI;
following a determination that the first TDEI and the second TDEI share a level of commonality that is greater than the threshold level of commonality, identify one or more mismatches between the first TDEI and the second TDEI, a mismatch being a difference in an alpha-numerical symbol in both the first TDEI and the second TDEI; and
determine whether a threshold number of mismatches exists between the first TDEI and the second TDEI; and
an electronic switch configured to:
in response to a determination that a threshold number of mismatches does not exist between the first TDEI and the second TDEI, running a script to access a plurality of data files and to overwrite data lineage information in the data files, and generating a display on a graphical user interface (“GUI”) to prompt a user to overwrite the first TDEI with the second TDEI in a storage location containing the first TDEI, wherein the overwriting accelerates the running of the script and reduces a bandwidth of the computer;
in response to a determination that a threshold number of mismatches exists between the first TDEI and the second TDEI, maintain the second TDEI in the storage location containing the first TDEI; and
in response to a failure of the electronic switch to output a determination regarding a threshold number of mismatches existing between the first TDEI and the second TDEI, taking remedial action by generating a display on the GUI to prompt the user to input a new search query.
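
The decision flow recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the claim does not define how the length percentage, the alpha-numerical matching percentage, or the mismatch count are computed, and it recites ML where this sketch uses plain position-by-position string comparison. All function names, the `Action` enum, the default threshold values, and the sample identifiers are hypothetical. The sketch also assumes that a pair falling below the commonality thresholds yields no mismatch determination and therefore leads to the remedial prompt.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    """Hypothetical outcomes corresponding to the three switch branches."""
    PROMPT_OVERWRITE = "prompt user to overwrite first TDEI with second TDEI"
    MAINTAIN = "maintain the stored TDEI unchanged"
    PROMPT_NEW_QUERY = "prompt user to input a new search query"


def length_pct(first: str, second: str) -> float:
    """Shorter length as a percentage of the longer length (assumed metric)."""
    longer = max(len(first), len(second))
    return 100.0 * min(len(first), len(second)) / longer if longer else 0.0


def match_pct(first: str, second: str) -> float:
    """Percentage of positions holding the same symbol (assumed metric)."""
    longer = max(len(first), len(second))
    if longer == 0:
        return 0.0
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / longer


def count_mismatches(first: str, second: str) -> int:
    """Count differing positions; unpaired trailing symbols count as mismatches."""
    positional = sum(a != b for a, b in zip(first, second))
    return positional + abs(len(first) - len(second))


def mismatch_threshold_exists(
    first: str,
    second: str,
    min_length_pct: float = 80.0,   # assumed value
    min_match_pct: float = 80.0,    # assumed value
    mismatch_threshold: int = 3,    # assumed value
) -> Optional[bool]:
    """Return True/False for the mismatch determination, or None if none is made.

    Assumption: no determination is made when the pair does not exceed
    both commonality thresholds.
    """
    if length_pct(first, second) <= min_length_pct:
        return None
    if match_pct(first, second) <= min_match_pct:
        return None
    return count_mismatches(first, second) >= mismatch_threshold


def switch(determination: Optional[bool]) -> Action:
    """Map the determination onto the three branches of the electronic switch."""
    if determination is None:
        return Action.PROMPT_NEW_QUERY   # remedial action
    if determination:
        return Action.MAINTAIN           # threshold number of mismatches exists
    return Action.PROMPT_OVERWRITE       # threshold number does not exist


if __name__ == "__main__":
    # Hypothetical identifiers, chosen only to exercise each branch.
    for other in ("CUST_ACCT_NBR_02", "CUST_ACCT_NBX_99", "XYZ"):
        result = switch(mismatch_threshold_exists("CUST_ACCT_NBR_01", other))
        print(other, "->", result.name)
```

In this sketch the overwrite branch only reports that a prompt should be shown; running the script that rewrites data lineage information in the data files, and rendering the GUI prompt, are outside its scope.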