US 12,205,019 B2
Data layout conscious processing in memory architecture for executing neural network model
Minxuan Zhou, San Mateo, CA (US); Weifeng Zhang, San Mateo, CA (US); and Guoyang Chen, San Mateo, CA (US)
Assigned to Alibaba Group Holding Limited, Grand Cayman (KY)
Filed by ALIBABA GROUP HOLDING LIMITED, George Town (KY)
Filed on Nov. 19, 2019, as Appl. No. 16/688,889.
Prior Publication US 2021/0150311 A1, May 20, 2021
Int. Cl. G06N 3/065 (2023.01); G06F 15/78 (2006.01); G06N 3/04 (2023.01); G11C 16/10 (2006.01)
CPC G06N 3/065 (2023.01) [G06F 15/7821 (2013.01); G06N 3/04 (2013.01); G11C 16/10 (2013.01)]
29 Claims
 
1. A processing in memory (PIM) enabled device for executing a neural network model, comprising:
a memory block assembly with a plurality of rows, comprising:
a first array of memory blocks arranged within the plurality of rows;
a second array of memory blocks arranged within the plurality of rows adjacent to the first array of memory blocks;
a plurality of first data links associated with the first array of memory blocks and the second array of memory blocks, wherein each data link of the plurality of first data links communicatively couples two corresponding memory blocks of a same row of the plurality of rows from the first array of memory blocks and the second array of memory blocks respectively; and
a second data link comprising two parallel column data links arranged between the first array of memory blocks and the second array of memory blocks and directly communicatively coupled to the plurality of first data links, respectively,
wherein data from a first memory block of a first row of the first array of memory blocks is transferable to a second memory block of a second row different from the first row of the second array of memory blocks via the plurality of first data links and either or both of the two column data links of the second data link.
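
The transfer path recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware or the applicants' implementation: the class name `MemoryBlockAssembly`, the methods `route` and `transfer`, the hop tuples, and the indexing of the two column data links as 0 and 1 are all hypothetical names introduced for illustration. The model assumes one memory block per row in each of the two arrays, one row data link ("first data link") per row, and two column data links ("second data link") between the arrays.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBlockAssembly:
    """Toy model of two adjacent arrays of memory blocks that share rows."""

    num_rows: int
    # blocks[array_index][row] holds the payload of one memory block.
    # Array 0 is the "first array", array 1 the "second array" (assumed layout).
    blocks: list = field(init=False)

    def __post_init__(self):
        self.blocks = [[None] * self.num_rows for _ in range(2)]

    def route(self, src_row, dst_row, column_link=0):
        """List the hops from the first array (src_row) to the second array (dst_row)."""
        if column_link not in (0, 1):
            raise ValueError("the second data link has exactly two column links")
        for row in (src_row, dst_row):
            if not 0 <= row < self.num_rows:
                raise IndexError(f"row {row} is outside the assembly")

        # Leave the source block over the row data link of its own row.
        hops = [("block", 0, src_row), ("row_link", src_row)]
        if src_row != dst_row:
            # Change rows over one of the two parallel column data links,
            # then enter the row data link of the destination row.
            hops.append(("column_link", column_link))
            hops.append(("row_link", dst_row))
        hops.append(("block", 1, dst_row))
        return hops

    def transfer(self, src_row, dst_row, column_link=0):
        """Copy data along the routed path and return the hops taken."""
        path = self.route(src_row, dst_row, column_link)
        self.blocks[1][dst_row] = self.blocks[0][src_row]
        return path
```

For example, after storing a value in the first array's block at row 0, calling `transfer(0, 2, column_link=1)` copies it to the second array's block at row 2 and returns a path that passes through the row 0 link, column link 1, and the row 2 link. The model selects a single column link per transfer; the claim also allows both column data links to be used, which this sketch does not cover.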