US 12,346,289 B1
Cross-platform flexible data model for dynamic storage, management, and retrieval of high-volume object data
Jill Theresa Vandenbosch, Toronto (CA); Conrad Banneker Owen, Toronto (CA); Aleksandar Djuric, Toronto (CA); Karanbir Singh Randhawa, Toronto (CA); Lakshmanan Arumugam, Toronto (CA); Matthew Michael Burbidge, Herriman, UT (US); Nicholas Cernek, Knoxville, TN (US); Scott Michael Nielsen, South Jordan, UT (US); and Travis Bennett Martin, Salt Lake City, UT (US)
Assigned to Recursion Pharmaceuticals, Inc., Salt Lake City, UT (US)
Filed by Recursion Pharmaceuticals, Inc., Salt Lake City, UT (US)
Filed on Oct. 24, 2023, as Appl. No. 18/493,298.
Int. Cl. G06F 16/11 (2019.01); G06F 16/182 (2019.01); G06N 3/084 (2023.01)
CPC G06F 16/119 (2019.01) [G06F 16/182 (2019.01); G06N 3/084 (2013.01)]
20 Claims
1. A computer-implemented method comprising:
generating a cross-platform metadata database comprising a plurality of file identifiers and comprising metadata for a plurality of digital files stored across a plurality of digital repository platforms storing machine learning data;
generating a cross-platform file location database comprising the plurality of file identifiers and comprising a plurality of file locations for the plurality of digital files across the plurality of digital repository platforms storing machine learning data;
receiving, via one or more servers from a requestor device, a machine learning dataset request comprising one or more characteristics for a machine learning dataset;
determining, via the one or more servers, file identifiers for a set of digital files for the machine learning dataset by searching the metadata of the cross-platform metadata database utilizing the one or more characteristics from the machine learning dataset request;
searching, via the one or more servers, the plurality of file identifiers of the cross-platform file location database utilizing the file identifiers for the set of digital files for the machine learning dataset determined from the cross-platform metadata database to identify digital storage locations corresponding to the plurality of digital repository platforms for the set of digital files for the machine learning dataset by:
identifying a first storage location for a first digital file stored at a first digital repository platform utilizing a first file identifier; and
identifying a second storage location for a second digital file stored at a second digital repository platform utilizing a second file identifier;
generating a machine learning dataset response, for the requestor device, indicating the digital storage locations of the set of digital files for the machine learning dataset, the machine learning dataset response comprising the first storage location for the first digital file stored at the first digital repository platform and the second storage location for the second digital file stored at the second digital repository platform; and
training a machine learning model utilizing the set of digital files for the machine learning dataset.
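The two-stage lookup recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory dictionaries, the file identifiers (`f1`, `f2`, `f3`), the platform names, the paths, and every function name below are hypothetical stand-ins chosen for illustration. The sketch covers only the metadata search, the location search, and the dataset response; the final training step is omitted.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins for the two cross-platform databases.
# Both are keyed by the same file identifiers, as in the claim.
METADATA_DB = {
    "f1": {"modality": "image", "experiment": "exp-A"},
    "f2": {"modality": "image", "experiment": "exp-A"},
    "f3": {"modality": "text", "experiment": "exp-B"},
}
LOCATION_DB = {
    "f1": ("platform-1", "bucket-a/images/f1.png"),
    "f2": ("platform-2", "/mnt/store/images/f2.png"),
    "f3": ("platform-1", "bucket-a/text/f3.txt"),
}


@dataclass(frozen=True)
class StorageLocation:
    file_id: str
    platform: str
    path: str


def find_file_ids(characteristics):
    """Search the metadata store for files matching every requested characteristic."""
    return sorted(
        file_id
        for file_id, meta in METADATA_DB.items()
        if all(meta.get(key) == value for key, value in characteristics.items())
    )


def resolve_locations(file_ids):
    """Look up each file identifier in the location store."""
    return [StorageLocation(fid, *LOCATION_DB[fid]) for fid in file_ids]


def build_dataset_response(characteristics):
    """Chain the two searches and package the storage locations for the requestor."""
    file_ids = find_file_ids(characteristics)
    return {
        "characteristics": characteristics,
        "locations": resolve_locations(file_ids),
    }


# Example request: the matching files live on two different platforms.
response = build_dataset_response({"modality": "image", "experiment": "exp-A"})
for location in response["locations"]:
    print(location.file_id, location.platform, location.path)
```

Keeping metadata and locations in separate stores joined only by file identifier means, in this sketch, that a file could move between platforms by updating one location entry while its metadata stays unchanged.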