US 11,656,909 B2
Tensor accelerator capable of increasing efficiency of data sharing
Shao-Yi Chien, Taipei (TW); Yu-Sheng Lin, Yunlin County (TW); and Wei-Chao Chen, Taipei (TW)
Assigned to National Taiwan University, Taipei (TW)
Filed by National Taiwan University, Taipei (TW)
Filed on Apr. 15, 2021, as Appl. No. 17/231,011.
Prior Publication US 2022/0334880 A1, Oct. 20, 2022
Int. Cl. G06F 9/50 (2006.01); G06F 7/57 (2006.01); G06F 9/54 (2006.01)
CPC G06F 9/5027 (2013.01) [G06F 7/57 (2013.01); G06F 9/544 (2013.01)]
12 Claims
 
1. A tensor accelerator comprising:
  a first tile execution unit comprising:
    a first buffer comprising a plurality of first memory cells;
    a plurality of first arithmetic logic units;
    a first network coupled to the plurality of first memory cells; and
    a first selector coupled to the first network and the plurality of first arithmetic logic units;
  a second tile execution unit comprising:
    a second buffer comprising a plurality of second memory cells;
    a plurality of second arithmetic logic units;
    a second network coupled to the plurality of second memory cells; and
    a second selector coupled to the second network and the plurality of second arithmetic logic units; and
  a bidirectional queue coupled between the first selector and the second selector;
  wherein the first selector comprises a plurality of switches each comprising:
    a first input port coupled to the first network for receiving a first input signal from the first network;
    a second input port coupled to the bidirectional queue for receiving a second input signal from the bidirectional queue;
    a first output port coupled to a first arithmetic logic unit of the first arithmetic logic units for outputting the first input signal from the first input port or the second input signal from the second input port; and
    a second output port coupled to the bidirectional queue for outputting the first input signal from the first input port.
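
The port behavior of one switch recited in claim 1 can be illustrated with a minimal behavioral sketch. It is not the patented circuit. The names `BidirectionalQueue`, `Switch`, `step`, `use_queue`, and `forward` are hypothetical, as is the choice to model the bidirectional queue as one FIFO per direction and to drive the switch with two boolean control inputs; the claim does not specify how the selection is controlled.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque


@dataclass
class BidirectionalQueue:
    """Hypothetical model: one FIFO per direction between the two selectors."""
    to_second: Deque[int] = field(default_factory=deque)  # first -> second selector
    to_first: Deque[int] = field(default_factory=deque)   # second -> first selector


@dataclass
class Switch:
    """Behavioral model of one switch in the first selector (assumed control inputs)."""
    queue: BidirectionalQueue

    def step(self, network_value: int, use_queue: bool, forward: bool) -> int:
        """Return the value driven on the first output port (toward the ALU).

        network_value: signal at the first input port (from the first network).
        use_queue:     if True, the ALU receives the second input port's signal
                       (taken from the bidirectional queue) instead.
        forward:       if True, the second output port pushes the first input
                       signal into the bidirectional queue for the other tile.
        """
        if forward:
            # Second output port: only the first input signal can be forwarded.
            self.queue.to_second.append(network_value)
        if use_queue:
            if not self.queue.to_first:
                raise ValueError("no data available from the bidirectional queue")
            # First output port selects the second input signal.
            return self.queue.to_first.popleft()
        # First output port selects the first input signal.
        return network_value


# Usage: the second tile has already shared the value 7 through the queue.
q = BidirectionalQueue()
q.to_first.append(7)
sw = Switch(q)
local = sw.step(3, use_queue=False, forward=True)    # ALU gets local data, 3 is shared
shared = sw.step(5, use_queue=True, forward=False)   # ALU gets data from the other tile
```

In this sketch the same value read once from the first buffer can feed a local arithmetic logic unit and be handed to the second tile execution unit, which is the data-sharing path the claim describes.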