| CPC G06F 16/1752 (2019.01) [G06F 3/0608 (2013.01); G06F 3/0641 (2013.01); G06F 3/067 (2013.01)] | 6 Claims |

|
1. A system for secure deduplication of compacted data, comprising:
at least one reference codebook comprising key-value pairs of data;
a library manager comprising at least a processor, a memory, and a plurality of programming instructions stored in the memory and operable on the processor of a computing device, wherein the plurality of programming instructions, when operating on the processor, cause the processor to:
receive a plurality of deconstructed sourceblocks from a data deconstruction engine;
perform secure data deduplication by comparing each of the plurality of deconstructed sourceblocks with sourceblocks already contained in the reference codebook, wherein:
the library manager uses machine learning algorithms to dynamically optimize sourceblock size based on data patterns and storage efficiency metrics;
access to both the reference codebook and a returned reference code is required to reconstruct an original sourceblock; and
the reference codebook and the returned reference codes are stored separately from one another;
return the reference code to the data deconstruction engine, when the sourceblock received is a duplicate of an existing sourceblock in the reference codebook; and
for each received deconstructed sourceblock that is not present in the codebook:
create a new, unique reference code for the respective deconstructed sourceblock using machine learning algorithms that dynamically optimize reference code generation based on frequency analysis of previously stored sourceblocks and predicted future data patterns;
store both the respective deconstructed sourceblock and the associated reference code in the reference codebook as a key-value pair; and
return the new reference code to the data deconstruction engine.
|
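
The lookup-or-add flow recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the names `LibraryManager`, `lookup_or_add`, `reconstruct`, and `deconstruct` are hypothetical, the sourceblock size is fixed rather than dynamically optimized, and a plain incrementing counter stands in for the machine-learning-based reference code generation. Using the sourceblock as the dictionary key is likewise an assumption made for fast duplicate checks.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LibraryManager:
    """Hypothetical stand-in for the library manager of claim 1."""

    # Reference codebook: sourceblock bytes -> reference code (assumed layout).
    codebook: Dict[bytes, int] = field(default_factory=dict)
    # Simple counter; the claim recites ML-optimized code generation instead.
    next_code: int = 0

    def lookup_or_add(self, sourceblock: bytes) -> int:
        """Return the existing code for a duplicate, or store and return a new one."""
        code = self.codebook.get(sourceblock)
        if code is not None:
            return code  # duplicate: only the reference code is returned
        code = self.next_code
        self.next_code += 1
        self.codebook[sourceblock] = code  # stored as a key-value pair
        return code

    def reconstruct(self, codes: List[int]) -> bytes:
        """Rebuild data; needs both the codebook and the reference codes."""
        reverse = {code: block for block, code in self.codebook.items()}
        return b"".join(reverse[code] for code in codes)


def deconstruct(data: bytes, block_size: int) -> List[bytes]:
    """Hypothetical deconstruction engine: fixed-size sourceblocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


if __name__ == "__main__":
    manager = LibraryManager()
    blocks = deconstruct(b"ABCDABCDEFGH", 4)
    # The reference codes would be kept apart from the codebook in practice.
    codes = [manager.lookup_or_add(block) for block in blocks]
    print(codes)
    print(manager.reconstruct(codes))
```

In this sketch the list of reference codes alone reveals nothing about the original data, and the codebook alone does not give the order of the sourceblocks, which mirrors the claim's requirement that both be available, and stored separately, for reconstruction.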