US 11,900,151 B2
Architecture to support color scheme-based synchronization for machine learning
Avinash Sodani, San Jose, CA (US); Senad Durakovic, Palo Alto, CA (US); and Gopal Nalamalapu, Santa Clara, CA (US)
Assigned to Marvell Asia Pte Ltd, Singapore (SG)
Filed by Marvell Asia Pte, Ltd., Singapore (SG)
Filed on Apr. 22, 2021, as Appl. No. 17/237,752.
Application 17/237,752 is a continuation of application No. 16/420,055, filed on May 22, 2019, granted, now 11,016,801.
Application 16/420,055 is a continuation in part of application No. 16/226,539, filed on Dec. 19, 2018, granted, now 10,824,433.
Claims priority of provisional application 62/675,076, filed on May 22, 2018.
Prior Publication US 2021/0240521 A1, Aug. 5, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/48 (2006.01); G06F 3/06 (2006.01); G06N 20/00 (2019.01); G06F 9/52 (2006.01); G06F 9/38 (2018.01); G06F 15/78 (2006.01); G06F 17/16 (2006.01); G06F 9/30 (2018.01); G06F 15/80 (2006.01); G06N 5/04 (2023.01)
CPC G06F 9/4818 (2013.01) [G06F 3/0604 (2013.01); G06F 3/0659 (2013.01); G06F 3/0673 (2013.01); G06F 9/4881 (2013.01); G06F 9/52 (2013.01); G06N 20/00 (2019.01); G06F 9/30018 (2013.01); G06F 9/30087 (2013.01); G06F 9/3869 (2013.01); G06F 9/3871 (2013.01); G06F 9/522 (2013.01); G06F 15/7807 (2013.01); G06F 15/7846 (2013.01); G06F 15/8053 (2013.01); G06F 17/16 (2013.01); G06N 5/04 (2013.01)]
23 Claims
 
1. A system, comprising:
an array-based inference engine comprising a plurality of processing tiles, wherein each processing tile comprises at least one or more of
an on-chip memory (OCM) configured to load and maintain data for local access by components in the processing tile;
one or more processing units configured to perform one or more computation tasks of an operation on data in the OCM by executing a set of task instructions; and
an instruction engine configured to
distribute said set of task instructions to corresponding processing tiles of the inference engine to control operations of the corresponding processing tiles via a tile mask having an indicator associated therewith, wherein the tile mask indicates which processing tiles the instructions should be delivered to;
synchronize said set of task instructions to be executed by each processing tile of the plurality of processing tiles, respectively, to wait current task at each processing tile to finish before changing the indicator to a different indicator and distributing a new set of task instructions based on the different indicator.
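
The distribute and synchronize steps recited in claim 1 can be illustrated with a minimal, single-threaded Python sketch. It is not the patented implementation. All names here (Tile, InstructionEngine, distribute, synchronize, demo) and the two-color indicator set are hypothetical, chosen only for illustration. The sketch assumes the tile mask is an integer bit mask in which bit i selects tile i, and it models "waiting for the current task to finish" by draining each tile's queue before the indicator changes.

```python
from dataclasses import dataclass, field


@dataclass
class Tile:
    """Hypothetical processing tile: a pending queue plus a log of finished work."""
    tile_id: int
    pending: list = field(default_factory=list)
    done: list = field(default_factory=list)

    def run_pending(self):
        # Execute queued instructions in order, logging (indicator, instruction).
        while self.pending:
            self.done.append(self.pending.pop(0))

    def busy(self):
        return bool(self.pending)


class InstructionEngine:
    """Hypothetical instruction engine holding the current indicator ("color")."""

    def __init__(self, num_tiles, colors=("red", "green")):
        self.tiles = [Tile(i) for i in range(num_tiles)]
        self.colors = colors  # assumed indicator values, not from the patent
        self.color_idx = 0

    @property
    def color(self):
        return self.colors[self.color_idx]

    def distribute(self, tile_mask, instructions):
        # Deliver the instructions only to tiles whose bit is set in the mask,
        # tagging each instruction with the indicator current at distribution.
        for tile in self.tiles:
            if (tile_mask >> tile.tile_id) & 1:
                tile.pending.extend((self.color, ins) for ins in instructions)

    def synchronize(self):
        # Let every tile finish its current work, then change the indicator.
        for tile in self.tiles:
            tile.run_pending()
        assert not any(t.busy() for t in self.tiles)
        self.color_idx = (self.color_idx + 1) % len(self.colors)


def demo():
    engine = InstructionEngine(num_tiles=4)
    engine.distribute(0b0101, ["load", "matmul"])  # tiles 0 and 2
    engine.synchronize()                           # indicator: red -> green
    engine.distribute(0b1010, ["relu"])            # tiles 1 and 3
    engine.synchronize()
    return [t.done for t in engine.tiles]


if __name__ == "__main__":
    for tile_id, log in enumerate(demo()):
        print(tile_id, log)
```

In this sketch the second set of instructions is tagged with the new indicator only after every tile has drained its queue, which mirrors the ordering in the synchronize limitation. Real hardware would run the tiles concurrently and would need an actual completion signal in place of the queue drain.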