US 11,842,197 B2
System and methods for tag-based synchronization of tasks for machine learning operations
Avinash Sodani, San Jose, CA (US); and Gopal Nalamalapu, Santa Clara, CA (US)
Assigned to Marvell Asia Pte Ltd, Singapore (SG)
Filed by Marvell Asia Pte Ltd, Singapore (SG)
Filed on Feb. 28, 2023, as Appl. No. 18/115,206.
Application 18/115,206 is a continuation of application No. 16/864,049, filed on Apr. 30, 2020, granted, now U.S. Pat. No. 11,604,683.
Claims priority of provisional application 62/950,745, filed on Dec. 19, 2019.
Prior Publication US 2023/0205540 A1, Jun. 29, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/38 (2018.01); G06F 9/52 (2006.01); G06F 9/48 (2006.01); G06F 15/80 (2006.01); G06F 15/76 (2006.01)
CPC G06F 9/3851 (2013.01) [G06F 9/3836 (2013.01); G06F 9/3869 (2013.01); G06F 9/4881 (2013.01); G06F 9/52 (2013.01); G06F 9/522 (2013.01); G06F 15/80 (2013.01); G06F 9/38 (2013.01); G06F 15/76 (2013.01)]
22 Claims
 
1. A hardware-based programmable architecture to support tag-based synchronization comprising:
a plurality of processing blocks arranged in a two-dimensional array of a plurality of rows and columns, wherein each of the plurality of processing blocks comprises a plurality of processing tiles connected to one another; and
an instruction streaming engine configured to
accept a first task for an operation, wherein the first task has a set tag for determining whether one or more subsequent tasks need to be synchronized with the first task;
save the set tag of the first task to a tag table;
transmit and route instructions of the first task to a set of processing tiles to be executed based on a destination mask of the first task if the first task does not have an instruction sync tag;
accept a second task for the operation, wherein the second task has an instruction sync tag to synchronize the second task with a prior task;
match the instruction sync tag of the second task with entries in the tag table;
hold instructions of the second task if a match is found until the matching entry in the tag table is invalidated or removed, indicating that the synchronization with the prior task the second task depends on is done; and
release and transmit the instructions of the second task to a set of processing tiles to be executed based on a destination mask of the second task.
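
The hold-and-release behavior recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the names `Task`, `InstructionStreamingEngine`, `accept`, and `complete` are hypothetical, the tag table is modeled as a Python set, the destination mask is assumed to be a bitmask with one bit per processing tile, and the event that invalidates a tag table entry is assumed to be an explicit `complete` call.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative assumptions."""
    name: str
    dest_mask: int                      # assumed: bit i set -> route to tile i
    set_tag: Optional[int] = None       # tag that later tasks may sync on
    ins_sync_tag: Optional[int] = None  # tag of the prior task to wait for


class InstructionStreamingEngine:
    """Toy model of the tag-table matching described in claim 1."""

    def __init__(self, num_tiles: int) -> None:
        self.num_tiles = num_tiles
        self.tag_table: Set[int] = set()   # valid (not yet invalidated) set tags
        self.held: List[Task] = []         # tasks waiting on a matching entry
        self.dispatched: List[Tuple[str, List[int]]] = []  # dispatch log

    def _tiles(self, mask: int) -> List[int]:
        # Decode the destination mask into tile indices.
        return [i for i in range(self.num_tiles) if (mask >> i) & 1]

    def _dispatch(self, task: Task) -> None:
        # Stand-in for "transmit and route instructions" to the tiles.
        self.dispatched.append((task.name, self._tiles(task.dest_mask)))

    def accept(self, task: Task) -> None:
        # Match the instruction sync tag against the tag table first.
        must_hold = (task.ins_sync_tag is not None
                     and task.ins_sync_tag in self.tag_table)
        # Save the task's own set tag so later tasks can sync on it.
        if task.set_tag is not None:
            self.tag_table.add(task.set_tag)
        if must_hold:
            self.held.append(task)
        else:
            self._dispatch(task)

    def complete(self, set_tag: int) -> None:
        # Invalidate the entry, then release held tasks that no longer match.
        self.tag_table.discard(set_tag)
        still_held: List[Task] = []
        for task in self.held:
            if task.ins_sync_tag in self.tag_table:
                still_held.append(task)
            else:
                self._dispatch(task)
        self.held = still_held


engine = InstructionStreamingEngine(num_tiles=4)
engine.accept(Task("first", dest_mask=0b0011, set_tag=7))
engine.accept(Task("second", dest_mask=0b1100, ins_sync_tag=7))
held_before = [t.name for t in engine.held]   # "second" waits on tag 7
engine.complete(7)                            # entry invalidated, "second" released
```

In this model the first task is dispatched immediately because it carries no instruction sync tag, while the second task is held until the entry for tag 7 is removed from the table and is then routed according to its own destination mask.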