US 11,934,826 B2
Vector reductions using shared scratchpad memory
Thomas Norrie, Mountain View, CA (US); Gurushankar Rajamani, Sunnyvale, CA (US); Andrew Everett Phelps, Middleton, WI (US); Matthew Leever Hedlund, Madison, WI (US); and Norman Paul Jouppi, Palo Alto, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Nov. 19, 2021, as Appl. No. 17/530,869.
Application 17/530,869 is a continuation of application No. 17/007,569, filed on Aug. 31, 2020, granted, now Pat. No. 11,182,159.
Claims priority of provisional application 62/981,957, filed on Feb. 26, 2020.
Prior Publication US 2022/0156071 A1, May 19, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/30 (2018.01); G06F 13/28 (2006.01); G06F 15/78 (2006.01); G06N 3/045 (2023.01)

CPC G06F 9/30036 (2013.01) [G06F 9/3001 (2013.01); G06F 9/3004 (2013.01); G06F 13/28 (2013.01); G06F 15/7821 (2013.01); G06N 3/045 (2023.01)]

8 Claims

1. A method performed using an integrated circuit for a hardware machine-learning accelerator that includes a plurality of cores and a shared memory that communicates with each of the plurality of cores, the method comprising:

generating, by each of the plurality of cores, a respective vector of values;

performing, across the plurality of cores and into a shared memory cell in the shared memory, a plurality of atomic vector reductions using each of the respective vectors and an operator unit of the shared memory without synchronization; and

generating a result vector based on the plurality of atomic vector reductions.
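
For illustration only, the three steps of claim 1 can be modeled in software. The sketch below is a toy simulation, not the claimed hardware: threads stand in for cores, and the names `SharedScratchpad`, `atomic_reduce`, `core_program`, and `run_reduction` are hypothetical. It assumes an element-wise add as the reduction operator and uses a lock inside the memory object to stand in for whatever atomicity the operator unit of the shared memory provides. The cores themselves never coordinate with one another.

```python
import threading
from typing import Callable, List


class SharedScratchpad:
    """Toy model of a shared memory with an operator unit (hypothetical)."""

    def __init__(self, width: int, op: Callable[[float, float], float],
                 identity: float) -> None:
        # One shared memory cell holding a vector of `width` lanes.
        self._cell: List[float] = [identity] * width
        self._op = op
        # Assumption: this lock models atomicity provided by the memory's
        # operator unit. It is internal to the memory, not a core-to-core
        # synchronization primitive.
        self._lock = threading.Lock()

    def atomic_reduce(self, vector: List[float]) -> None:
        """Combine `vector` element-wise into the shared cell as one atomic step."""
        if len(vector) != len(self._cell):
            raise ValueError("vector width must match the cell width")
        with self._lock:
            self._cell = [self._op(a, b) for a, b in zip(self._cell, vector)]

    def read(self) -> List[float]:
        """Return a copy of the current cell contents."""
        with self._lock:
            return list(self._cell)


def core_program(core_id: int, width: int, memory: SharedScratchpad) -> None:
    """One simulated core: generate a vector, then reduce it into shared memory."""
    # Each core generates its own respective vector of values.
    vector = [float(core_id + 1) * (lane + 1) for lane in range(width)]
    # The core issues its reduction without waiting on any other core.
    memory.atomic_reduce(vector)


def run_reduction(num_cores: int = 4, width: int = 8) -> List[float]:
    """Run all simulated cores and return the result vector."""
    memory = SharedScratchpad(width, op=lambda a, b: a + b, identity=0.0)
    threads = [
        threading.Thread(target=core_program, args=(core_id, width, memory))
        for core_id in range(num_cores)
    ]
    for thread in threads:
        thread.start()
    # The join only lets this host-side driver know when to read the result;
    # it is not a barrier between the simulated cores.
    for thread in threads:
        thread.join()
    return memory.read()


if __name__ == "__main__":
    print(run_reduction())
```

Because the assumed operator (addition) is commutative and associative, the result vector does not depend on the order in which the simulated cores reach the shared cell, which is why no ordering between cores is needed in this model.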