US 11,755,536 B1
System-independent data lineage system
Yahor Pushkin, Redmond, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jan. 10, 2020, as Appl. No. 16/740,263.
Int. Cl. G06F 16/17 (2019.01); G06F 16/16 (2019.01); G06F 11/34 (2006.01); G06F 11/14 (2006.01); G06F 16/21 (2019.01)
CPC G06F 16/1734 (2019.01) [G06F 11/1464 (2013.01); G06F 11/3495 (2013.01); G06F 16/168 (2019.01); G06F 16/219 (2019.01); G06F 2201/84 (2013.01)]
20 Claims
 
1. A system, comprising:
a plurality of computing devices, respectively comprising at least one processor and a memory, that implement a data lineage service as part of a provider network, the data lineage service configured to:
receive, via an interface for the data lineage service, one or more requests to describe the execution of respective revisions of one or more transformations of a data flow, the respective revisions of the one or more transformations performed on a revision of a dataset and the respective revisions of the one or more transformations associated with one or more data processors that executed the one or more transformations;
evaluate a graph modeling the data flow to identify respective child nodes of the graph representing different ones of the respective revisions of the one or more transformations that correspond to one or more parent nodes representing the one or more transformations; and
record, by the data lineage service, the performance of the one or more transformations as part of a performance history for the data flow in the respective child nodes of the graph according to the described execution of the respective revisions of the one or more transformations specified in the one or more requests.
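The receive, evaluate, and record steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the class names, request fields (`transformation`, `transformation_revision`, `dataset_revision`, `data_processor`), and the in-memory dictionary layout are all hypothetical, and raising an error for an unknown revision is an assumption the claim does not address.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RevisionNode:
    """Child node: one revision of a transformation (hypothetical shape)."""
    revision_id: str
    performance_history: List[dict] = field(default_factory=list)


@dataclass
class TransformationNode:
    """Parent node: a transformation in the data flow (hypothetical shape)."""
    name: str
    revisions: Dict[str, RevisionNode] = field(default_factory=dict)


class LineageGraph:
    """In-memory stand-in for the graph modeling a data flow."""

    def __init__(self) -> None:
        self.transformations: Dict[str, TransformationNode] = {}

    def add_revision(self, transformation: str, revision_id: str) -> RevisionNode:
        # Create the parent node on first use, then attach the child revision.
        parent = self.transformations.setdefault(
            transformation, TransformationNode(transformation)
        )
        return parent.revisions.setdefault(revision_id, RevisionNode(revision_id))

    def find_revision(self, transformation: str, revision_id: str) -> Optional[RevisionNode]:
        # "Evaluate" step: locate the child node under its parent transformation.
        parent = self.transformations.get(transformation)
        return parent.revisions.get(revision_id) if parent else None

    def record_execution(self, request: dict) -> RevisionNode:
        # "Record" step: append the described execution to the child's history.
        child = self.find_revision(
            request["transformation"], request["transformation_revision"]
        )
        if child is None:
            # Assumption: unknown revisions are rejected rather than auto-created.
            raise KeyError("unknown transformation revision")
        child.performance_history.append(
            {
                "dataset_revision": request["dataset_revision"],
                "data_processor": request["data_processor"],
            }
        )
        return child


# Usage: describe one execution of revision "r2" of a "dedupe" transformation.
graph = LineageGraph()
graph.add_revision("dedupe", "r1")
graph.add_revision("dedupe", "r2")
node = graph.record_execution(
    {
        "transformation": "dedupe",
        "transformation_revision": "r2",
        "dataset_revision": "orders@v7",
        "data_processor": "spark-cluster-a",
    }
)
```

Keeping the history on the child node rather than the parent mirrors the claim's wording that performance is recorded "in the respective child nodes", so each revision carries its own execution record.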