US 11,989,597 B2
Dataset connector and crawler to identify data lineage and segment data
Austin Walters, Savoy, IL (US); Mark Watson, Urbana, IL (US); Galen Rafferty, Mahomet, IL (US); Anh Truong, Champaign, IL (US); Jeremy Goodsitt, Champaign, IL (US); and Vincent Pham, Champaign, IL (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by CAPITAL ONE SERVICES, LLC, McLean, VA (US)
Filed on Oct. 20, 2021, as Appl. No. 17/505,840.
Application 17/505,840 is a continuation of application No. 16/577,010, filed on Sep. 20, 2019, granted, now 11,182,223.
Application 16/577,010 is a continuation of application No. 16/251,867, filed on Jan. 18, 2019, granted, now 10,459,954, issued on Oct. 29, 2019.
Claims priority of provisional application 62/694,968, filed on Jul. 6, 2018.
Prior Publication US 2022/0083402 A1, Mar. 17, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/00 (2019.01); G06F 8/71 (2018.01); G06F 9/54 (2006.01); G06F 11/36 (2006.01); G06F 16/22 (2019.01); G06F 16/242 (2019.01); G06F 16/2455 (2019.01); G06F 16/248 (2019.01); G06F 16/25 (2019.01); G06F 16/28 (2019.01); G06F 16/335 (2019.01); G06F 16/903 (2019.01); G06F 16/9032 (2019.01); G06F 16/9038 (2019.01); G06F 16/906 (2019.01); G06F 16/93 (2019.01); G06F 17/15 (2006.01); G06F 17/16 (2006.01); G06F 17/18 (2006.01); G06F 18/20 (2023.01); G06F 18/21 (2023.01); G06F 18/2115 (2023.01); G06F 18/213 (2023.01); G06F 18/214 (2023.01); G06F 18/22 (2023.01); G06F 18/23 (2023.01); G06F 18/24 (2023.01); G06F 18/2411 (2023.01); G06F 18/2415 (2023.01); G06F 18/40 (2023.01); G06F 21/55 (2013.01); G06F 21/60 (2013.01); G06F 21/62 (2013.01); G06F 30/20 (2020.01); G06F 40/117 (2020.01); G06F 40/166 (2020.01); G06F 40/20 (2020.01); G06N 3/04 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/06 (2006.01); G06N 3/08 (2023.01); G06N 3/088 (2023.01); G06N 5/00 (2023.01); G06N 5/02 (2023.01); G06N 5/04 (2023.01); G06N 7/00 (2023.01); G06N 7/01 (2023.01); G06N 20/00 (2019.01); G06Q 10/04 (2023.01); G06T 7/194 (2017.01); G06T 7/246 (2017.01); G06T 7/254 (2017.01); G06T 11/00 (2006.01); G06V 10/70 (2022.01); G06V 10/98 (2022.01); G06V 30/194 (2022.01); G06V 30/196 (2022.01); H04L 9/40 (2022.01); H04L 67/00 (2022.01); H04L 67/306 (2022.01); H04N 21/234 (2011.01); H04N 21/81 (2011.01)
CPC G06F 9/541 (2013.01) [G06F 8/71 (2013.01); G06F 9/54 (2013.01); G06F 9/547 (2013.01); G06F 11/3608 (2013.01); G06F 11/3628 (2013.01); G06F 11/3636 (2013.01); G06F 16/2237 (2019.01); G06F 16/2264 (2019.01); G06F 16/2423 (2019.01); G06F 16/24568 (2019.01); G06F 16/248 (2019.01); G06F 16/254 (2019.01); G06F 16/258 (2019.01); G06F 16/283 (2019.01); G06F 16/285 (2019.01); G06F 16/288 (2019.01); G06F 16/335 (2019.01); G06F 16/90332 (2019.01); G06F 16/90335 (2019.01); G06F 16/9038 (2019.01); G06F 16/906 (2019.01); G06F 16/93 (2019.01); G06F 17/15 (2013.01); G06F 17/16 (2013.01); G06F 17/18 (2013.01); G06F 18/2115 (2023.01); G06F 18/213 (2023.01); G06F 18/214 (2023.01); G06F 18/2148 (2023.01); G06F 18/217 (2023.01); G06F 18/2193 (2023.01); G06F 18/22 (2023.01); G06F 18/23 (2023.01); G06F 18/24 (2023.01); G06F 18/2411 (2023.01); G06F 18/2415 (2023.01); G06F 18/285 (2023.01); G06F 18/40 (2023.01); G06F 21/552 (2013.01); G06F 21/60 (2013.01); G06F 21/6245 (2013.01); G06F 21/6254 (2013.01); G06F 30/20 (2020.01); G06F 40/117 (2020.01); G06F 40/166 (2020.01); G06F 40/20 (2020.01); G06N 3/04 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/06 (2013.01); G06N 3/08 (2013.01); G06N 3/088 (2013.01); G06N 5/00 (2013.01); G06N 5/02 (2013.01); G06N 5/04 (2013.01); G06N 7/00 (2013.01); G06N 7/01 (2023.01); G06N 20/00 (2019.01); G06Q 10/04 (2013.01); G06T 7/194 (2017.01); G06T 7/246 (2017.01); G06T 7/248 (2017.01); G06T 7/254 (2017.01); G06T 11/001 (2013.01); G06V 10/768 (2022.01); G06V 10/993 (2022.01); G06V 30/194 (2022.01); G06V 30/1985 (2022.01); H04L 63/1416 (2013.01); H04L 63/1491 (2013.01); H04L 67/306 (2013.01); H04L 67/34 (2013.01); H04N 21/23412 (2013.01); H04N 21/8153 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] 20 Claims
1. A client device comprising:
one or more memory units storing instructions; and
one or more processors that execute the instructions to perform operations comprising:
receiving, at the client device, an input comprising a command to segment an actual dataset;
transmitting, to a dataset connector system, a request to segment the actual dataset, the dataset connector system being configured to:
generate, by a data mapping model, a plurality of edges between the actual dataset and one or more synthetic datasets, the edges being based on at least one of a foreign key score based on a frequency of occurrence of a foreign key in the actual dataset and one or more synthetic datasets, the foreign key including an index comprising a list of known foreign keys for estimating a probability of a data object being a unique foreign key and associated with the actual dataset, a data schema associated with the actual dataset, a hierarchical relationship associated with the actual dataset, or a statistical metric associated with the actual dataset; and
generate a segmented cluster associating the actual dataset with the one or more synthetic datasets based on the generated edges;
receiving the cluster from the dataset connector system; and
displaying a graphical representation of the cluster at the client device.
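
As an illustration only, the edge-generation and clustering steps recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the claim does not define how the foreign key score is computed, so the sketch assumes it is the fraction of column values found in a list of known foreign keys. The function names, the dict-of-columns dataset layout, and the 0.5 threshold are all hypothetical.

```python
def foreign_key_score(actual_column, synthetic_column, known_foreign_keys):
    """Assumed scoring rule: share of values, across both columns,
    that occur in the list of known foreign keys."""
    values = list(actual_column) + list(synthetic_column)
    if not values:
        return 0.0
    hits = sum(1 for value in values if value in known_foreign_keys)
    return hits / len(values)


def generate_edges(actual, synthetics, known_foreign_keys, threshold=0.5):
    """Create one edge per shared column whose score meets the
    (hypothetical) threshold. Datasets are dicts: column -> list of values."""
    edges = []
    for name, synthetic in synthetics.items():
        for column in sorted(set(actual) & set(synthetic)):
            score = foreign_key_score(
                actual[column], synthetic[column], known_foreign_keys
            )
            if score >= threshold:
                edges.append(("actual", name, column, score))
    return edges


def segment_cluster(edges):
    """Collect the actual dataset and every synthetic dataset joined to it."""
    cluster = {"actual"}
    for source, target, _column, _score in edges:
        cluster.update((source, target))
    return cluster


# Hypothetical example data, invented for illustration.
actual = {"customer_id": ["c1", "c2", "c3"], "amount": [10, 20, 30]}
synthetics = {
    "synth_a": {"customer_id": ["c1", "c2", "c9"]},
    "synth_b": {"amount": [5, 6, 7]},
}
known_foreign_keys = {"c1", "c2", "c3"}

edges = generate_edges(actual, synthetics, known_foreign_keys)
cluster = segment_cluster(edges)
print(sorted(cluster))  # ['actual', 'synth_a']
```

In this example only synth_a shares a column whose values frequently match known foreign keys, so it alone joins the cluster. The claim also permits edges based on a data schema, a hierarchical relationship, or a statistical metric, which this sketch does not model.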