CPC G06F 16/2458 (2019.01) [G06F 16/188 (2019.01); G06F 16/215 (2019.01); G06F 16/285 (2019.01)] | 20 Claims |
1. A system for processing a large file, comprising:
one or more processors; and
a memory in communication with the one or more processors and storing instructions that, when executed by the one or more processors, are configured to cause the system to:
receive record data comprising a plurality of records, each of the plurality of records having a data format comprising a sequence of characters;
determine, based on a comparison of a size of the record data to a predetermined size threshold, an order of magnitude for a seed portion;
determine, based on the data format, a plurality of unique focus values, each of the plurality of unique focus values corresponding to a sub-group of the plurality of records, wherein each of the plurality of unique focus values corresponds to a specified portion of the sequence of characters in the data format, a number of the plurality of unique focus values being based on the order of magnitude of the seed portion;
create a plurality of virtual processing units, each associated with a unique one of the plurality of unique focus values; and
process, by each of the plurality of virtual processing units, the corresponding sub-group of the plurality of records that corresponds to the focus value associated with the respective virtual processing unit.
|
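The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and every name and rule in it is an assumption made for illustration: the size of the record data is taken to be the record count, SIZE_THRESHOLD is a hypothetical threshold, the seed portion is taken to be the trailing characters of each record, and the virtual processing units are modeled as thread-pool tasks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

SIZE_THRESHOLD = 1_000  # hypothetical predetermined size threshold


def seed_order_of_magnitude(record_count, threshold=SIZE_THRESHOLD):
    """Assumed rule: a 1-character seed below the threshold, 2 characters otherwise."""
    return 1 if record_count < threshold else 2


def focus_value(record, magnitude):
    """Assumed rule: the focus value is the last `magnitude` characters of the record."""
    return record[-magnitude:]


def process_group(group):
    """Placeholder for the per-group work; here it only counts the records."""
    return len(group)


def process_records(records, threshold=SIZE_THRESHOLD):
    # Compare the size of the record data to the threshold to size the seed portion.
    magnitude = seed_order_of_magnitude(len(records), threshold)

    # Derive the unique focus values and the sub-group of records for each one.
    groups = defaultdict(list)
    for record in records:
        groups[focus_value(record, magnitude)].append(record)

    # One worker task per unique focus value stands in for a virtual processing unit.
    with ThreadPoolExecutor(max_workers=len(groups) or 1) as pool:
        futures = {fv: pool.submit(process_group, grp) for fv, grp in groups.items()}
        return {fv: future.result() for fv, future in futures.items()}
```

With records made of decimal digits, a one-character seed yields at most 10 focus values and a two-character seed at most 100, which is one way the number of focus values could follow from the order of magnitude of the seed portion.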