US 12,141,662 B1
Parallelizable distributed data preservation apparatuses, methods and systems
Neil Couture, Thornhill (CA); Babak Afshin-Pour, Oakville (CA); and Anthony J. Iacovone, Huntington, NY (US)
Assigned to ADTHEORENT, INC., New York, NY (US)
Filed by AdTheorent, Inc., New York, NY (US)
Filed on Jun. 26, 2017, as Appl. No. 15/633,676.
Application 15/633,676 is a continuation in part of application No. 13/797,903, filed on Mar. 12, 2013, granted, now 11,288,240.
Application 15/633,676 is a continuation in part of application No. 13/797,873, filed on Mar. 12, 2013.
Claims priority of provisional application 62/354,686, filed on Jun. 24, 2016.
Int. Cl. G06N 20/00 (2019.01); G06F 16/23 (2019.01); G06Q 30/0273 (2023.01)

CPC G06N 20/00 (2019.01) [G06F 16/2365 (2019.01); G06Q 30/0275 (2013.01)]

32 Claims

1. A real-time parallelized data integrity preservation apparatus, comprising:

at least one memory;

a component collection stored in the at least one memory;

any of at least one processor disposed in communication with the at least one memory, the any of at least one processor executing processor-executable instructions from the component collection, the component collection storage structured with processor-executable instructions comprising:

obtain an original dataset data structure from a plurality of data source types using a symmetry machine learning component;

determine, based on the obtained original dataset data structure, an appropriate type of symmetry machine learning basic element table;

generate original data distribution estimation data structure from the original dataset data structure;

generate new dataset random generation data structure from the original data distribution estimation data structure;

generate new random dataset transformation data structure by factorizing the new dataset random generation data structure;

transform the original dataset data structure with the symmetry machine learning basic element table and the new random dataset transformation data structure into a pseudo random dataset data structure;

provide the pseudo random dataset data structure to a machine learning component; and

generate build classifier and build regression structures from the machine learning component.
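
The sequence of steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the claim does not specify the estimator, the factorization, or the form of the "symmetry machine learning basic element table", so everything below is an assumption chosen for illustration. The sketch assumes a per-column Gaussian estimate, a random matrix drawn from that estimate, a Gram-Schmidt factorization that yields an orthogonal matrix, and a 1-nearest-neighbor classifier as the downstream learner. The basic element table and the regression structure are not modeled, and all function names are hypothetical.

```python
import math
import random

def estimate_distribution(rows):
    """Assumed estimator: per-column mean and standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return means, stds

def generate_random_dataset(means, stds, rng):
    """Draw a d x d random matrix from the estimated per-column Gaussians."""
    d = len(means)
    return [[rng.gauss(means[j], stds[j]) for j in range(d)] for _ in range(d)]

def factorize(matrix):
    """Assumed factorization: modified Gram-Schmidt on the rows.

    Returns a square matrix with orthonormal rows (an orthogonal matrix).
    """
    basis = []
    for row in matrix:
        v = list(row)
        for q in basis:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    return basis

def transform(row, means, q):
    """Center a record and rotate it with the orthogonal matrix q."""
    centered = [x - m for x, m in zip(row, means)]
    return [sum(qi * c for qi, c in zip(q_row, centered)) for q_row in q]

def nearest_neighbor_label(train_rows, labels, query):
    """Stand-in learner: label of the closest training record."""
    dists = [math.dist(r, query) for r in train_rows]
    return labels[dists.index(min(dists))]

# Toy data, invented for this illustration only.
rng = random.Random(0)
original = [[1.0, 2.0, 0.5], [1.2, 1.8, 0.4], [4.0, 5.0, 3.0], [4.2, 5.1, 2.9]]
labels = ["a", "a", "b", "b"]

means, stds = estimate_distribution(original)
q = factorize(generate_random_dataset(means, stds, rng))
pseudo = [transform(r, means, q) for r in original]

query = [3.9, 4.8, 3.1]
pred_original = nearest_neighbor_label(original, labels, query)
pred_pseudo = nearest_neighbor_label(pseudo, labels, transform(query, means, q))
```

Because an orthogonal transformation preserves Euclidean distances, a distance-based learner trained on the transformed records makes the same prediction as one trained on the originals, while the individual values in the transformed records no longer match the source values. Whether the claimed transformation has this property is not stated in the claim; it is a feature of the orthogonal rotation assumed here.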