US 11,748,158 B2
Data object preparation for execution of multiple task routine instances in many task computing
Henry Gabriel Victor Bequet, Cary, NC (US); Ronald Earl Stogner, Cary, NC (US); Eric Jian Yang, Morrisville, NC (US); and Chaowang “Ricky” Zhang, Morrisville, NC (US)
Assigned to SAS Institute Inc., Cary, NC (US)
Filed by SAS Institute Inc., Cary, NC (US)
Filed on Dec. 30, 2022, as Appl. No. 18/091,569.
Application 18/091,569 is a continuation in part of application No. 17/733,196, filed on Apr. 29, 2022.
Application 17/733,196 is a continuation of application No. 17/733,090, filed on Apr. 29, 2022.
Application 17/733,090 is a continuation in part of application No. 17/682,783, filed on Feb. 28, 2022, granted, now 11,474,863.
Application 17/682,783 is a continuation in part of application No. 17/563,697, filed on Dec. 28, 2021, granted, now 11,513,850.
Application 17/563,697 is a continuation of application No. 17/558,237, filed on Dec. 21, 2021, granted, now 11,455,190, issued on Sep. 27, 2022.
Application 17/558,237 is a continuation in part of application No. 17/308,355, filed on May 5, 2021, granted, now 11,204,809, issued on Dec. 21, 2021.
Application 17/308,355 is a continuation of application No. 17/225,023, filed on Apr. 7, 2021, granted, now 11,169,788, issued on Nov. 9, 2021.
Application 17/225,023 is a continuation in part of application No. 17/139,364, filed on Dec. 31, 2020, granted, now 11,144,293, issued on Oct. 12, 2021.
Application 17/139,364 is a continuation in part of application No. 17/064,577, filed on Oct. 6, 2020, granted, now 11,080,031, issued on Aug. 3, 2021.
Application 17/064,577 is a continuation in part of application No. 16/814,481, filed on Mar. 10, 2020, granted, now 10,795,935, issued on Oct. 6, 2020.
Application 16/814,481 is a continuation in part of application No. 16/708,179, filed on Dec. 9, 2019, granted, now 10,740,076, issued on Aug. 11, 2020.
Application 16/708,179 is a continuation in part of application No. 16/587,965, filed on Sep. 30, 2019, granted, now 10,650,046, issued on May 12, 2020.
Claims priority of provisional application 63/336,771, filed on Apr. 29, 2022.
Claims priority of provisional application 63/157,419, filed on Mar. 5, 2021.
Claims priority of provisional application 63/159,428, filed on Mar. 10, 2021.
Claims priority of provisional application 63/185,570, filed on May 7, 2021.
Claims priority of provisional application 63/252,070, filed on Oct. 4, 2021.
Claims priority of provisional application 63/139,703, filed on Jan. 20, 2021.
Claims priority of provisional application 63/006,516, filed on Apr. 7, 2020.
Claims priority of provisional application 63/008,830, filed on Apr. 13, 2020.
Claims priority of provisional application 63/015,274, filed on Apr. 24, 2020.
Claims priority of provisional application 63/029,989, filed on May 26, 2020.
Claims priority of provisional application 62/972,240, filed on Feb. 10, 2020.
Claims priority of provisional application 62/985,455, filed on Mar. 5, 2020.
Claims priority of provisional application 62/816,160, filed on Mar. 10, 2019.
Claims priority of provisional application 62/776,691, filed on Dec. 7, 2018.
Claims priority of provisional application 62/739,314, filed on Sep. 30, 2018.
Prior Publication US 2023/0147225 A1, May 11, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/48 (2006.01)
CPC G06F 9/4881 (2013.01) [G06F 9/485 (2013.01)]
27 Claims
 
1. An apparatus comprising at least one processor and a storage to store instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
receive, at the at least one processor, and from a requesting device via a network, a request to perform a job flow comprising a set of tasks;
within a performance container, the at least one processor is caused to output a first task routine execution request message;
within a first task container, and in response to the first task routine execution request message, the at least one processor is caused to perform operations of a first task comprising:
access a first data object within at least one federated area to determine whether the first data object is already divided into a first set of data object blocks;
in response to a determination that the first data object is not already divided, perform operations comprising:
analyze the first data object to determine a size of the first data object;
analyze a data structure by which data values are organized within the first data object to identify an atomic unit of storage of data values within the data structure, and to determine a size of the atomic unit;
based on at least the size of the first data object, the size of the atomic unit, and storage resources allocated to task containers, determine a quantity of data object blocks into which to divide the first data object;
divide the first data object into the quantity of data object blocks to generate the first set of data object blocks; and
output a first task completion message comprising a first set of data block identifiers, wherein each data block identifier of the first set of data block identifiers indicates a location within the at least one federated area at which a different data object block of the first set of data object blocks is stored; and
in response to a determination that the first data object is already divided, perform operations comprising:
retrieve the first set of data block identifiers from the at least one federated area; and
output the first task completion message comprising the first set of data block identifiers; and
within the performance container, and in response to the first task completion message, the at least one processor is caused to output a first set of task routine execution request messages to cause a second task to be performed by executing multiple instances of a task routine within multiple task containers at least partially in parallel, wherein:
each task routine execution request message of the first set of task routine execution request messages includes a different data block identifier of the first set of data block identifiers to cause the at least one processor to execute each instance of the task routine using a different data object block of the first set of data object blocks as an input.
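
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a byte-string data object, a fixed-size atomic unit, and a single storage limit shared by all task containers. The names `store`, `block_index`, `block_quantity`, `first_task` and `second_task` are hypothetical, an in-memory dictionary stands in for the federated area, and a thread pool stands in for the multiple task containers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for a federated area (assumptions, not from the claim):
# store maps an identifier to stored bytes; block_index records which data
# objects have already been divided and the identifiers of their blocks.
store: dict[str, bytes] = {}
block_index: dict[str, list[str]] = {}


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def block_quantity(object_size: int, unit_size: int, container_storage: int) -> int:
    """Pick the fewest blocks such that each block holds only whole atomic
    units and fits within the storage allocated to one task container."""
    units_per_block = container_storage // unit_size
    if units_per_block < 1:
        raise ValueError("a task container cannot hold even one atomic unit")
    total_units = ceil_div(object_size, unit_size)
    return max(1, ceil_div(total_units, units_per_block))


def first_task(name: str, unit_size: int, container_storage: int) -> list[str]:
    """Return data block identifiers, dividing the data object only if needed."""
    if name in block_index:  # already divided: just retrieve the identifiers
        return block_index[name]
    data = store[name]
    quantity = block_quantity(len(data), unit_size, container_storage)
    total_units = ceil_div(len(data), unit_size)
    # Spread the atomic units evenly; every cut falls on a unit boundary.
    step = ceil_div(total_units, quantity) * unit_size
    identifiers = []
    for i in range(quantity):
        block_id = f"{name}/block-{i}"
        store[block_id] = data[i * step:(i + 1) * step]
        identifiers.append(block_id)
    block_index[name] = identifiers
    return identifiers


def second_task(identifiers: list[str], task_routine) -> list:
    """Run one task routine instance per data object block, in parallel,
    each instance receiving a different block identifier."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda block_id: task_routine(store[block_id]), identifiers))


# Usage: a 40-byte object with 4-byte atomic units and 12 bytes per container.
store["obj"] = bytes(range(40))
ids = first_task("obj", unit_size=4, container_storage=12)
sizes = second_task(ids, len)
```

In this usage example the object holds ten atomic units and a container can hold three, so the sketch produces four blocks, and no block splits an atomic unit. A second call to `first_task` for the same object takes the already-divided branch and returns the stored identifiers without dividing again.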