US 12,405,930 B2
System and method for identifying poisoned data during data curation using data source characteristics
Ofir Ezrielev, Beer Sheva (IL); Hanna Yehuda, Acton, MA (US); and Kristen Jeanne Walsh, Austin, TX (US)
Assigned to Dell Products L.P., Round Rock, TX (US)
Filed by Dell Products L.P., Round Rock, TX (US)
Filed on Jun. 29, 2023, as Appl. No. 18/343,946.
Prior Publication US 2025/0005147 A1, Jan. 2, 2025
Int. Cl. G06F 16/215 (2019.01)
CPC G06F 16/215 (2019.01)
20 Claims
 
1. A method for curating data from data sources prior to addition to a data repository, comprising:
making an identification that the data comprises poisoned data; and
based on the identification:
obtaining a fitness analysis function based on criteria for evaluating potentially poisoned data, the criteria comprising:
an historical security posture for each data source of the data sources;
a current security posture of each data source; and
a number of data sources providing the data,
performing an optimization process using the data to identify the poisoned data, the optimization process comprises generating test proposals that indicate different delineations between the potentially poisoned data and potentially unpoisoned data, and evaluating the test proposals using the fitness analysis function, and
initiating performance of an action set, based on the identified poisoned data, to manage an impact of the identified poisoned data.
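
The optimization step recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation. The 0.0 to 1.0 posture scale, the weights, the 0.5 threshold, the exhaustive enumeration of proposals, and all names (`Source`, `Record`, `suspicion`, `fitness`, `identify_poisoned`) are assumptions made for illustration. In particular, "a number of data sources providing the data" is modeled here as the count of distinct sources that supply the same value, which is only one possible reading.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Source:
    name: str
    historical_posture: float  # assumed scale: 0.0 (poor) to 1.0 (strong)
    current_posture: float     # same assumed scale


@dataclass(frozen=True)
class Record:
    value: str
    source: Source


def suspicion(record, records, weights=(0.4, 0.4, 0.2)):
    """Combine the three claimed criteria into one score (weights are hypothetical)."""
    w_hist, w_cur, w_count = weights
    # Number of distinct sources providing the same value (assumed interpretation).
    providers = {r.source.name for r in records if r.value == record.value}
    return (w_hist * (1.0 - record.source.historical_posture)
            + w_cur * (1.0 - record.source.current_posture)
            + w_count * (1.0 / len(providers)))


def fitness(proposal, records, threshold=0.5):
    """Score one test proposal: a tuple of booleans, True meaning 'poisoned'."""
    total = 0.0
    for flagged, record in zip(proposal, records):
        s = suspicion(record, records)
        # Reward flagging suspicious records and leaving unsuspicious ones alone.
        total += (s - threshold) if flagged else (threshold - s)
    return total


def identify_poisoned(records):
    """Generate every delineation and keep the one with the highest fitness.

    Exhaustive enumeration is only practical for tiny inputs; a real system
    would presumably use a heuristic search instead.
    """
    proposals = product((False, True), repeat=len(records))
    best = max(proposals, key=lambda p: fitness(p, records))
    return [r for flagged, r in zip(best, records) if flagged]


# Hypothetical data: two well-secured sources agree, one weak source disagrees.
a = Source("sensor-a", historical_posture=0.9, current_posture=0.9)
b = Source("sensor-b", historical_posture=0.8, current_posture=0.9)
c = Source("sensor-c", historical_posture=0.2, current_posture=0.1)
records = [Record("temp=21", a), Record("temp=21", b), Record("temp=95", c)]

flagged = identify_poisoned(records)
print([r.value for r in flagged])
```

In this sketch the uncorroborated record from the weakly secured source is the only one flagged. The final claimed step, initiating an action set on the identified data, would then operate on `flagged`, for example by excluding those records before they are added to the repository.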