US 12,346,286 B2
Two-dimensional processing array with a vertically stacked memory tile array
Shuangchen Li, Sunnyvale, CA (US); Zhe Zhang, Shanghai (CN); Dimin Niu, San Mateo, CA (US); and Hongzhong Zheng, Los Gatos, CA (US)
Assigned to ALIBABA (CHINA) CO., LTD., Zhejiang Province (CN)
Filed by ALIBABA (CHINA) CO., LTD., Zhejiang Province (CN)
Filed on Dec. 12, 2022, as Appl. No. 18/064,520.
Claims priority of application No. 202210967128.X (CN), filed on Aug. 12, 2022.
Prior Publication US 2024/0054096 A1, Feb. 15, 2024
Int. Cl. G06F 15/78 (2006.01); G06F 12/0813 (2016.01); G06F 15/173 (2006.01); G06F 15/80 (2006.01)

CPC G06F 15/17381 (2013.01) [G06F 12/0813 (2013.01); G06F 15/7807 (2013.01); G06F 15/7825 (2013.01); G06F 15/8023 (2013.01)]

14 Claims

1. A multi-core processor for performing parallel computation, the multi-core processor comprising:

a logic die, comprising:

a plurality of processor cores, wherein each processor core is programmable; and

a plurality of networks on chip, wherein the plurality of networks on chip are correspondingly connected to the plurality of processor cores, so that the plurality of processor cores form a two-dimensional mesh network; and

a memory die, vertically stacked with the processor core, the memory die comprising:

a plurality of memory tiles, wherein when the multi-core processor performs the parallel computation, the plurality of memory tiles do not have cache coherency,

wherein, the plurality of memory tiles correspond to the plurality of processor cores in a one-to-one or one-to-many manner,

wherein, each processor core comprises:

a computation logic module, coupled to the corresponding network on chip; and

a memory interface module electrically coupled to the corresponding computation logic module and used in accessing the corresponding memory tile of the plurality of memory tiles without passing through the networks on chip, so that the computation logic module accesses the corresponding memory tile through the corresponding memory interface module without passing through the networks on chip, and wherein each computation logic module is allowed to access the corresponding memory tile and is not allowed to access the memory tiles other than the corresponding memory tile.
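
The arrangement recited in claim 1 can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the patented implementation: the names `MemoryTile`, `Core`, `Mesh`, `tiles_per_core`, and `send` are hypothetical, and the dimension-ordered (XY) routing used for the mesh is an assumption, since the claim does not specify a routing scheme. The sketch shows two properties from the claim: each core reaches only its own memory tile(s) directly, without using the network on chip, and data moves between cores only as messages over the two-dimensional mesh.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryTile:
    """One tile on the memory die (hypothetical model)."""
    owner: tuple                      # (row, col) of the only core allowed to access it
    cells: dict = field(default_factory=dict)


class Core:
    """A programmable core on the logic die with a direct memory interface."""

    def __init__(self, coord, tiles_per_core=1):
        self.coord = coord
        # One-to-one (tiles_per_core=1) or one-to-many (>1) core-to-tile mapping.
        self.tiles = [MemoryTile(coord) for _ in range(tiles_per_core)]
        self.inbox = []               # messages delivered by the mesh

    def _check(self, tile):
        # A core is not allowed to access tiles other than its own.
        if tile.owner != self.coord:
            raise PermissionError(
                f"core {self.coord} may not access a tile owned by {tile.owner}"
            )

    def store(self, tile, addr, value):
        self._check(tile)
        tile.cells[addr] = value      # direct access, no mesh hop involved

    def load(self, tile, addr):
        self._check(tile)
        return tile.cells[addr]


class Mesh:
    """Two-dimensional mesh connecting the cores (routing scheme is assumed)."""

    def __init__(self, rows, cols, tiles_per_core=1):
        self.rows, self.cols = rows, cols
        self.cores = {
            (r, c): Core((r, c), tiles_per_core)
            for r in range(rows)
            for c in range(cols)
        }

    def send(self, src, dst, payload):
        """Deliver payload from src to dst; return the list of hops taken.

        Assumed XY routing: step along the column index first, then the row index.
        """
        r, c = src
        path = [src]
        while c != dst[1]:
            c += 1 if dst[1] > c else -1
            path.append((r, c))
        while r != dst[0]:
            r += 1 if dst[0] > r else -1
            path.append((r, c))
        self.cores[dst].inbox.append(payload)
        return path
```

In this model, a core that needs data held in another core's tile cannot read it directly; the owning core would load the value from its own tile and pass it with `Mesh.send`. Because no two cores share a tile, the model needs no cache-coherency mechanism between tiles, which mirrors the non-coherent memory tiles recited in the claim.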