US 11,687,241 B2
System and method for data compaction utilizing mismatch probability estimation
Joshua Cooper, Columbia, SC (US); Aliasghar Riahi, Orinda, CA (US); Mojgan Haddad, Orinda, CA (US); Ryan Kourosh Riahi, Orinda, CA (US); Razmin Riahi, Orinda, CA (US); and Charles Yeomans, Orinda, CA (US)
Assigned to ATOMBEAM TECHNOLOGIES INC., Moraga, CA (US)
Filed by AtomBeam Technologies Inc., Moraga, CA (US)
Filed on Oct. 26, 2022, as Appl. No. 17/974,230.
Application 17/974,230 is a continuation in part of application No. 17/884,470, filed on Aug. 9, 2022.
Application 17/884,470 is a continuation of application No. 17/727,913, filed on Apr. 25, 2022.
Application 17/727,913 is a continuation of application No. 17/404,699, filed on Aug. 17, 2021, granted, now 11,385,794, issued on Jul. 12, 2022.
Application 17/404,699 is a continuation in part of application No. 16/455,655, filed on Jun. 27, 2019, granted, now 10,509,771, issued on Dec. 17, 2019.
Application 16/455,655 is a continuation in part of application No. 16/200,466, filed on Nov. 26, 2018, granted, now 10,476,519, issued on Nov. 12, 2019.
Application 16/200,466 is a continuation in part of application No. 15/975,741, filed on May 9, 2018, granted, now 10,303,391, issued on May 28, 2019.
Claims priority of provisional application 62/578,824, filed on Oct. 30, 2017.
Claims priority of provisional application 63/232,050, filed on Aug. 11, 2021.
Prior Publication US 2023/0043546 A1, Feb. 9, 2023
Int. Cl. G06F 3/06 (2006.01); H03M 7/30 (2006.01)
CPC G06F 3/0608 (2013.01) [G06F 3/067 (2013.01); G06F 3/0623 (2013.01); G06F 3/0659 (2013.01); H03M 7/6005 (2013.01); H03M 7/6011 (2013.01)]
18 Claims
 
1. A system for encoding data using mismatch probability estimation, comprising:
a computing device comprising a processor, a memory, and a non-volatile data storage device;
a statistical analyzer comprising a first plurality of programming instructions stored in the memory which, when operating on the processor, causes the computing device to:
receive a training data set for encoding, the training data set comprising sourceblocks of data;
determine a frequency of occurrence of each sourceblock of the training data set;
calculate a mismatch probability estimate comprising a probability that any given sourceblock in a non-training data set to be later received for encoding will not be a sourceblock that was contained in the training data set;
generate a mismatch sourceblock representing sourceblocks that were not contained in the training data set, and assign the mismatch probability estimate to the mismatch sourceblock as the frequency of occurrence of the mismatch sourceblock; and
a codebook generator comprising a second plurality of programming instructions stored in the memory which, when operating on the processor, causes the computing device to:
generate a codebook from the sourceblocks of the training data set and the mismatch sourceblock using an entropy encoding method wherein codewords are assigned to each sourceblock based on its frequency of occurrence.
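
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several choices in it are assumptions made only for illustration. The claim does not say how the mismatch probability is estimated, so the sketch uses the Good-Turing missing-mass estimate (the fraction of training observations that are sourceblocks seen exactly once), with a small floor so the estimate is never zero. The claim names only "an entropy encoding method", so Huffman coding is used as one example. The names MISMATCH, split_sourceblocks, estimate_mismatch_probability, build_frequency_table and huffman_codebook, and the fixed 2-byte sourceblock size, are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count

# Hypothetical sentinel standing in for the claim's "mismatch sourceblock".
MISMATCH = object()


def split_sourceblocks(data: bytes, size: int = 2) -> list:
    """Cut the input into fixed-size sourceblocks (the size is an assumption)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def estimate_mismatch_probability(counts: Counter) -> float:
    """Assumed estimator: Good-Turing missing mass, floored above zero."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("training data set is empty")
    singletons = sum(1 for c in counts.values() if c == 1)
    return max(singletons / total, 1.0 / (total + 1))


def build_frequency_table(blocks: list) -> dict:
    """Statistical analyzer: per-sourceblock frequencies plus the mismatch entry."""
    counts = Counter(blocks)
    p_miss = estimate_mismatch_probability(counts)
    total = sum(counts.values())
    # Scale observed frequencies so that all entries together sum to 1.
    freqs = {b: (c / total) * (1.0 - p_miss) for b, c in counts.items()}
    # The mismatch estimate becomes the mismatch sourceblock's frequency.
    freqs[MISMATCH] = p_miss
    return freqs


def huffman_codebook(freqs: dict) -> dict:
    """Codebook generator: Huffman codewords derived from the frequencies."""
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, low = heapq.heappop(heap)
        f1, _, high = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in low.items()}
        merged.update({s: "1" + w for s, w in high.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]


training = b"ABABABABCD"
freqs = build_frequency_table(split_sourceblocks(training))
codebook = huffman_codebook(freqs)
```

In this sketch the most frequent sourceblock receives the shortest codeword, and the mismatch sourceblock receives a codeword like any other entry. How an encoder would use that codeword for sourceblocks absent from the training data is outside claim 1 and is not shown here.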