| CPC G06F 16/215 (2019.01) [G06F 11/0793 (2013.01); G06F 16/2246 (2019.01); G06F 16/2365 (2019.01)] | 20 Claims |

|
1. A method comprising:
executing a job on one or more workers of a compute resource, wherein executing the job further comprises invoking one or more data entities;
detecting that a data entity in the one or more data entities is corrupt in response to a determination that execution of the job has failed;
identifying a lineage data identifier associated with the data entity based on a mapping of lineage data identifiers to data entity identifiers;
accessing lineage data that is stored in association with the identified lineage data identifier, the lineage data having been generated based on a query tree that was used to generate the data entity, and the lineage data identifying a set of data entities that rely on the data entity;
identifying, based on the lineage data, one or more upstream data entities from the data entity;
determining that the one or more upstream data entities on which the data entity depends have been corrupted; and
providing an indication of the corruption in the one or more upstream data entities or the data entity to a client device.
|
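
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the names `LineageRecord`, `ENTITY_TO_LINEAGE_ID`, `LINEAGE_STORE`, and `diagnose_failed_job`, along with the sample entity identifiers, are hypothetical and chosen only for illustration. The sketch assumes the lineage data has already been derived from a query tree and stored, and that corruption is represented as a simple set of entity identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical lineage data for one data entity."""
    upstream: list = field(default_factory=list)    # entities this one depends on
    downstream: list = field(default_factory=list)  # entities that rely on this one


# Hypothetical mapping of data entity identifiers to lineage data identifiers.
ENTITY_TO_LINEAGE_ID = {"orders_daily": "lin-1"}

# Hypothetical store of lineage data keyed by lineage data identifier.
LINEAGE_STORE = {
    "lin-1": LineageRecord(
        upstream=["orders_raw", "customers"],
        downstream=["revenue_report"],
    ),
}


def diagnose_failed_job(job_entities, job_failed, corrupt_entities):
    """Return a corruption indication for a failed job, or None.

    job_entities: entity identifiers invoked by the job.
    job_failed: whether execution of the job failed.
    corrupt_entities: set of identifiers known to be corrupt (an assumption;
        the claim does not say how corruption is detected).
    """
    if not job_failed:
        return None
    for entity in job_entities:
        if entity not in corrupt_entities:
            continue
        # Look up the lineage data identifier for the corrupt entity.
        lineage_id = ENTITY_TO_LINEAGE_ID.get(entity)
        if lineage_id is None:
            continue
        # Access the stored lineage data and inspect the upstream entities.
        record = LINEAGE_STORE[lineage_id]
        corrupt_upstream = [u for u in record.upstream if u in corrupt_entities]
        # The returned dictionary stands in for the indication sent to a client.
        return {
            "entity": entity,
            "lineage_id": lineage_id,
            "corrupt_upstream": corrupt_upstream,
            "affected_downstream": list(record.downstream),
        }
    return None
```

Calling `diagnose_failed_job(["orders_daily"], True, {"orders_daily", "orders_raw"})` under these assumptions reports `orders_raw` as the corrupted upstream entity and `revenue_report` as an entity that relies on `orders_daily`.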