US 12,111,791 B2
Using machine learning to select compression algorithms for compressing binary datasets
John Krasner, Coventry, RI (US); and Sweetesh Singh, Benares (IN)
Assigned to DELL PRODUCTS, L.P., Hopkinton, MA (US)
Filed by EMC IP HOLDING COMPANY LLC, Hopkinton, MA (US)
Filed on Dec. 7, 2020, as Appl. No. 17/113,237.
Prior Publication US 2022/0179829 A1, Jun. 9, 2022
Int. Cl. G06F 16/174 (2019.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06F 16/1744 (2019.01) [G06N 5/04 (2013.01); G06N 20/00 (2019.01)]
20 Claims
 
1. An apparatus comprising:
at least one compute node that manages access to non-volatile storage, the compute node configured to respond to commands from host nodes to access host application data stored on the non-volatile storage, wherein the host application data comprises binary data;
a data model that has been trained to predict compression efficiency of binary data structures by a plurality of data compression algorithms based on sizes of components of the binary data structures, where each of the binary data structures comprises a header, metadata, signature, encoding, and a plurality of the components; and
a recommendation engine that uses the data model to determine which one of the plurality of data compression algorithms will most efficiently compress selected binary data and recommends that compression algorithm;
wherein the compute node compresses the selected binary data using the recommended compression algorithm.
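
The flow recited in claim 1 (a trained data model, a recommendation engine, then compression with the recommended algorithm) can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it is hypothetical. The claim names neither the compression algorithms nor the model type, so the sketch assumes three Python standard-library codecs as candidates and a simple 1-nearest-neighbour lookup as a stand-in for the trained data model. It also assumes each binary data structure is already split into the same number of components, and uses the component sizes as the only features.

```python
import bz2
import lzma
import zlib

# Candidate algorithms. The claim does not name any; these stdlib codecs are stand-ins.
ALGORITHMS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def features(components):
    """Feature vector: the size in bytes of each component."""
    return [len(c) for c in components]


def train(samples):
    """Build a lookup 'model' (an assumption, not the patent's model type).

    For each training structure, record its component sizes and the
    compression ratio actually measured for every candidate algorithm.
    """
    model = []
    for components in samples:
        blob = b"".join(components)
        ratios = {name: len(blob) / len(fn(blob)) for name, fn in ALGORITHMS.items()}
        model.append((features(components), ratios))
    return model


def predict(model, feats):
    """Predict per-algorithm ratios from the nearest training example."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], feats))
    return min(model, key=squared_distance)[1]


def recommend(model, components):
    """Recommendation engine: the algorithm with the highest predicted ratio."""
    predicted = predict(model, features(components))
    return max(predicted, key=predicted.get)


def compress_with_recommendation(model, components):
    """Compress the selected data using the recommended algorithm."""
    name = recommend(model, components)
    return name, ALGORITHMS[name](b"".join(components))
```

In this sketch the recommendation is made from component sizes alone, without trial-compressing the selected data; trial compression happens only while building the training table.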