US 12,438,943 B2
System and method for offloading preprocessing of machine learning data to remote storage
Debojyoti Dutta, San Jose, CA (US); Johnu George, Bangalore (IN); Manosiz Bhattacharyya, San Jose, CA (US); and Roger Liao, San Jose, CA (US)
Assigned to Nutanix, Inc., San Jose, CA (US)
Filed by Nutanix, Inc., San Jose, CA (US)
Filed on Nov. 4, 2022, as Appl. No. 17/981,077.
Claims priority of application No. 202141052691 (IN), filed on Nov. 17, 2021.
Prior Publication US 2023/0156083 A1, May 18, 2023
Int. Cl. H04L 67/1097 (2022.01); G06F 9/455 (2018.01)
CPC H04L 67/1097 (2013.01) [G06F 9/45533 (2013.01)]
23 Claims
 
1. An apparatus comprising a processor and a memory, wherein the memory includes programmed instructions that, when executed by the processor, cause the apparatus to:
assign, by a resource scheduler, a first virtualized compute resource to a storage node of an object store on a first cloud, the storage node including a virtualized storage resource, wherein unstructured data is stored in the storage node;
preprocess, by the first virtualized compute resource of the storage node, at the storage node on the first cloud, the unstructured data stored in the storage node to generate preprocessed data;
transfer, via a public network, the preprocessed data generated by the first virtualized compute resource of the storage node to a compute node on a client system separate from the first cloud; and
assign, by the resource scheduler, a second virtualized compute resource to the compute node of the client system, wherein the second compute resource trains a machine learning (ML) model using the preprocessed data.
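The four steps recited in claim 1 can be illustrated with a minimal, in-memory sketch. It is not the patented implementation: every name here (ResourceScheduler, Node, ComputeResource, preprocess_at_storage, transfer, train) is hypothetical, tokenizing text stands in for preprocessing unstructured data, a word-frequency table stands in for a trained ML model, and the transfer over a public network is reduced to a local copy.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ComputeResource:
    """Hypothetical stand-in for a virtualized compute resource (VM, container, ...)."""
    name: str


@dataclass
class Node:
    """A storage node or a compute node; `compute` is filled in by the scheduler."""
    name: str
    location: str
    compute: Optional[ComputeResource] = None
    data: list = field(default_factory=list)


class ResourceScheduler:
    """Hypothetical scheduler that hands out compute resources to nodes."""

    def __init__(self) -> None:
        self._count = 0

    def assign(self, node: Node) -> ComputeResource:
        self._count += 1
        node.compute = ComputeResource(f"vcompute-{self._count}")
        return node.compute


def preprocess_at_storage(storage: Node) -> List[List[str]]:
    """Preprocess where the data lives: drop blank records, lowercase, tokenize."""
    if storage.compute is None:
        raise RuntimeError("no compute resource assigned to the storage node")
    return [rec.lower().split() for rec in storage.data if rec.strip()]


def transfer(preprocessed: List[List[str]], destination: Node) -> None:
    """Stand-in for the network transfer; only preprocessed data leaves the cloud."""
    destination.data = [list(tokens) for tokens in preprocessed]


def train(compute_node: Node) -> Dict[str, int]:
    """Toy 'model': token frequencies computed on the client-side compute node."""
    if compute_node.compute is None:
        raise RuntimeError("no compute resource assigned to the compute node")
    model: Dict[str, int] = {}
    for tokens in compute_node.data:
        for token in tokens:
            model[token] = model.get(token, 0) + 1
    return model


def run_pipeline(storage: Node, compute_node: Node) -> Dict[str, int]:
    scheduler = ResourceScheduler()
    scheduler.assign(storage)                      # first virtualized compute resource
    preprocessed = preprocess_at_storage(storage)  # preprocessing at the storage node
    transfer(preprocessed, compute_node)           # preprocessed data to the client system
    scheduler.assign(compute_node)                 # second virtualized compute resource
    return train(compute_node)                     # training on the preprocessed data


if __name__ == "__main__":
    storage = Node("storage-1", "first-cloud", data=["Hello World", "  ", "hello again"])
    client = Node("client-1", "client-system")
    print(run_pipeline(storage, client))
```

In this sketch the raw records never leave the storage node; only the smaller, preprocessed form is copied to the client, which is the data-movement saving the claim is directed at.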