CPC G06F 9/4881 (2013.01) [G06F 9/485 (2013.01)] | 27 Claims |
1. An apparatus comprising at least one processor and a storage to store instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
receive, at the at least one processor, and from a requesting device via a network, a request to perform a job flow comprising a set of tasks;
within a performance container, the at least one processor is caused to output a first task routine execution request message;
within a first task container, and in response to the first task routine execution request message, the at least one processor is caused to perform operations of a first task comprising:
access a first data object within at least one federated area to determine whether the first data object is already divided into a first set of data object blocks;
in response to a determination that the first data object is not already divided, perform operations comprising:
analyze the first data object to determine a size of the first data object;
analyze a data structure by which data values are organized within the first data object to identify an atomic unit of storage of data values within the data structure, and to determine a size of the atomic unit;
based on at least the size of the first data object, the size of the atomic unit, and storage resources allocated to task containers, determine a quantity of data object blocks into which to divide the first data object;
divide the first data object into the quantity of data object blocks to generate the first set of data object blocks; and
output a first task completion message comprising a first set of data block identifiers, wherein each data block identifier of the first set of data block identifiers indicates a location within the at least one federated area at which a different data object block of the first set of data object blocks is stored; and
in response to a determination that the first data object is already divided, perform operations comprising:
retrieve the first set of data block identifiers from the at least one federated area; and
output the first task completion message comprising the first set of data block identifiers; and
within the performance container, and in response to the first task completion message, the at least one processor is caused to output a first set of task routine execution request messages to cause a second task to be performed by executing multiple instances of a task routine within multiple task containers at least partially in parallel, wherein:
each task routine execution request message of the first set of task routine execution request messages includes a different data block identifier of the first set of data block identifiers to cause the at least one processor to execute each instance of the task routine using a different data object block of the first set of data object blocks as an input.
|
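The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and everything in it is an assumption made for illustration: a plain dictionary (`store`) stands in for the at least one federated area, threads stand in for task containers, messages are plain dictionaries, and the names `plan_block_count`, `divide`, `first_task` and `run_second_task` are hypothetical. The atomic unit size is passed in as a parameter rather than derived by analyzing the data structure, and the block-count rule shown (each block holds whole atomic units and fits within the storage allocated to one task container) is only one plausible reading of the "based on at least" limitation.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def plan_block_count(object_size, atomic_unit_size, container_storage):
    """Pick a block count so each block holds whole atomic units and fits a container."""
    if object_size <= 0:
        raise ValueError("data object must not be empty")
    if atomic_unit_size <= 0 or atomic_unit_size > container_storage:
        raise ValueError("atomic unit must be positive and fit within container storage")
    units_total = math.ceil(object_size / atomic_unit_size)
    units_per_block = container_storage // atomic_unit_size
    return math.ceil(units_total / units_per_block)


def divide(data, atomic_unit_size, block_count):
    """Split data into block_count blocks, cutting only on atomic unit boundaries."""
    units_total = math.ceil(len(data) / atomic_unit_size)
    step = math.ceil(units_total / block_count) * atomic_unit_size
    return [data[i:i + step] for i in range(0, len(data), step)]


def first_task(store, object_id, atomic_unit_size, container_storage):
    """First task: divide the object if needed, then report the block identifiers."""
    index_key = object_id + ".blocks"  # hypothetical marker that the object is already divided
    if index_key not in store:
        data = store[object_id]
        count = plan_block_count(len(data), atomic_unit_size, container_storage)
        block_ids = []
        for i, block in enumerate(divide(data, atomic_unit_size, count)):
            block_id = f"{object_id}.block{i}"
            store[block_id] = block  # each identifier names where one block is stored
            block_ids.append(block_id)
        store[index_key] = block_ids
    # Both branches emit the same form of completion message.
    return {"task": "first", "block_ids": list(store[index_key])}


def run_second_task(store, completion_message, task_routine):
    """Performance container: one execution request per block, run in parallel."""
    requests = [{"block_id": block_id} for block_id in completion_message["block_ids"]]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda req: task_routine(store[req["block_id"]]), requests))


# Usage: a 100-byte object, 10-byte atomic units, 35 bytes of storage per container.
store = {"obj": bytes(range(100))}
message = first_task(store, "obj", atomic_unit_size=10, container_storage=35)
results = run_second_task(store, message, len)
```

In this sketch a second call to `first_task` for the same object takes the "already divided" branch and returns the stored identifiers without dividing again, mirroring the two alternative branches of the first task.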