US 11,870,840 B2
Distributed partitioned map reduce using a data fabric
Jason Howes, Somerville, MA (US); and Noah Arliss, Lexington, MA (US)
Assigned to Workday, Inc., Pleasanton, CA (US)
Filed by Workday, Inc., Pleasanton, CA (US)
Filed on Jan. 12, 2018, as Appl. No. 15/870,407.
Prior Publication US 2019/0222633 A1, Jul. 18, 2019
Int. Cl. G06F 12/00 (2006.01); G06F 11/10 (2006.01); G06F 12/06 (2006.01); G06F 12/10 (2016.01); H04L 67/10 (2022.01); G06F 9/50 (2006.01); H04L 67/1001 (2022.01)
CPC H04L 67/10 (2013.01) [G06F 9/5066 (2013.01); H04L 67/1001 (2022.05)]
17 Claims
 
1. A system for a distributed partitioned map reduce, comprising:
a plurality of nodes; and
a processor configured to:
select a service node from the plurality of nodes to manage execution of a task based at least in part on a request to perform the task provided by a requestor, wherein the service node is selected based at least in part on a least loaded node of the plurality of nodes or a least recently selected node of the plurality of nodes, wherein each node of the plurality of nodes comprises: 1) data stored in a plurality of partitions and 2) partition metadata stored in a partition map, wherein the partition metadata identifies each partition of the plurality of partitions as a primary partition or a backup partition; and
provide the task to the service node, wherein the service node is configured to:
receive the task;
provide partition task logic of the task to the plurality of partition nodes, wherein the partition task logic is provided to a first partition node of the plurality of partition nodes, wherein the first partition node includes a first partition map and a first plurality of partitions, and wherein, for each partition of the first plurality of partitions, the first partition node:
determines, using the first partition map, whether a partition of the first plurality of partitions is a primary partition or a backup partition, wherein the partition is a first partition, and wherein the first partition node further: a) determines, using the first partition map, whether a second partition of the first plurality of partitions is a primary partition or a backup partition; and b) in response to a determination that the second partition is a backup partition, does not execute the partition map operation on the second partition; and
in response to a determination that the partition is a primary partition, executes the partition map operation on the partition to obtain a partition result;
receive node results from the plurality of nodes, wherein the node results include a first node result from the first partition node, wherein the first node result is determined by the first partition node using the partition result, and wherein the first node result is determined by executing a node reduce step on partition results for partitions of the first plurality of partitions;
aggregate the node results into a service node result; and
provide the service node result.
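
The flow recited in claim 1 can be illustrated with a minimal, single-process sketch. It is not the patented implementation. All names here (Node, select_service_node, run_task, the "primary"/"backup" strings, and the numeric load field) are hypothetical and chosen only for illustration. Of the two selection criteria in the claim, only least loaded is shown. Network transport between the service node and the other nodes is replaced by direct method calls, and summation stands in for the partition map operation, the node reduce step, and the final aggregation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical node holding partitions plus a partition map."""
    name: str
    load: int                        # assumed load metric, lower is less loaded
    partitions: Dict[int, List[int]]  # partition id -> stored data
    partition_map: Dict[int, str]     # partition id -> "primary" or "backup"

    def run_partition_task(
        self,
        map_op: Callable[[List[int]], int],
        node_reduce: Callable[[List[int]], int],
    ) -> int:
        partition_results = []
        for pid, data in self.partitions.items():
            # Consult the partition map; only primary partitions are processed.
            if self.partition_map[pid] == "primary":
                partition_results.append(map_op(data))
            # Backup partitions are skipped so no data is counted twice.
        # Node reduce step over this node's partition results.
        return node_reduce(partition_results)


def select_service_node(nodes: List[Node]) -> Node:
    """Pick the least loaded node (one of the two criteria in the claim)."""
    return min(nodes, key=lambda n: n.load)


def run_task(
    nodes: List[Node],
    map_op: Callable[[List[int]], int],
    node_reduce: Callable[[List[int]], int],
    aggregate: Callable[[List[int]], int],
) -> Tuple[str, int]:
    service = select_service_node(nodes)
    # The service node hands the partition task logic to every node
    # and collects one node result from each.
    node_results = [n.run_partition_task(map_op, node_reduce) for n in nodes]
    # The service node aggregates node results into the service node result.
    return service.name, aggregate(node_results)


# Two nodes, each holding both partitions; each partition is primary on
# exactly one node and backup on the other.
nodes = [
    Node("a", 3, {0: [1, 2, 3], 1: [4, 5]}, {0: "primary", 1: "backup"}),
    Node("b", 1, {0: [1, 2, 3], 1: [4, 5]}, {0: "backup", 1: "primary"}),
]
service_name, result = run_task(nodes, sum, sum, sum)
print(service_name, result)  # b 15
```

In this example node "b" is selected because it has the lower assumed load, and the total of 15 counts each stored value exactly once because the backup copies are never mapped.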