US 12,130,884 B2
Dataflow accelerator architecture for general matrix-matrix multiplication and tensor computation in deep learning
Peng Gu, Santa Barbara, CA (US); Krishna Malladi, San Jose, CA (US); Hongzhong Zheng, Los Gatos, CA (US); and Dimin Niu, Sunnyvale, CA (US)
Assigned to SAMSUNG ELECTRONICS CO., LTD., (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Jul. 13, 2021, as Appl. No. 17/374,988.
Application 17/374,988 is a continuation of application No. 16/388,860, filed on Apr. 18, 2019, granted, now 11,100,193.
Claims priority of provisional application 62/777,046, filed on Dec. 7, 2018.
Prior Publication US 2021/0374210 A1, Dec. 2, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 17/16 (2006.01); G06F 12/0802 (2016.01); G06F 12/0877 (2016.01); G06N 3/008 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2023.01); G06N 3/08 (2023.01)
CPC G06F 17/16 (2013.01) [G06F 12/0802 (2013.01); G06F 12/0877 (2013.01); G06N 3/008 (2013.01); G06N 3/045 (2023.01); G06N 3/063 (2013.01); G06F 2212/1024 (2013.01); G06F 2212/1036 (2013.01); G06F 2212/22 (2013.01); G06N 3/08 (2013.01)]
20 Claims
 
1. A system, comprising:
a memory;
a lookup data structure stored in the memory;
a first vector buffer configured to store a first vector that is used as a first address into the lookup data structure; and
a second vector buffer configured to store a second vector that is used as a second address into the lookup data structure;
wherein the lookup data structure is configured to provide a result based on a lookup operation, the result being generated based on a computation of the first vector and the second vector.
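
The arrangement recited in claim 1 can be illustrated with a minimal software sketch. This is not the patented hardware, only one possible reading under stated assumptions: the "computation" is taken to be an element-wise product, each vector element is treated as a row or column index into the table, and operands are limited to 4 bits so the table stays small. All names (`PRODUCT_LUT`, `lut_multiply`, `lut_dot`) and the operand width are hypothetical and do not appear in the claim.

```python
# Illustrative sketch only. Assumptions (not from the claim): 4-bit unsigned
# operands, and the "computation" is a product of the two addressed values.
BITS = 4
SIZE = 1 << BITS  # 16 possible values per operand

# Lookup data structure held in memory: entry [a][b] stores the
# precomputed product a * b, so no multiplier is needed at lookup time.
PRODUCT_LUT = [[a * b for b in range(SIZE)] for a in range(SIZE)]


def lut_multiply(first_vector, second_vector):
    """Return element-wise products obtained purely by table lookup.

    Each element of first_vector is used as a row address and the matching
    element of second_vector as a column address into PRODUCT_LUT.
    """
    if len(first_vector) != len(second_vector):
        raise ValueError("vectors must have equal length")
    results = []
    for a, b in zip(first_vector, second_vector):
        if not (0 <= a < SIZE and 0 <= b < SIZE):
            raise ValueError("operand outside the assumed 4-bit range")
        results.append(PRODUCT_LUT[a][b])  # lookup replaces multiplication
    return results


def lut_dot(first_vector, second_vector):
    """Accumulate the looked-up products into a dot product."""
    return sum(lut_multiply(first_vector, second_vector))
```

For example, `lut_multiply([1, 2, 3], [4, 5, 6])` reads three table entries and `lut_dot` sums them. A dot product is the inner step of a matrix-matrix multiplication, which is why a precomputed table of this kind could stand in for multiplier circuits in such a design.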