| CPC G06F 15/8053 (2013.01) [G06F 7/5443 (2013.01); G06F 7/57 (2013.01); G06F 9/3818 (2013.01); G06F 9/3828 (2013.01); G06F 9/3877 (2013.01)] | 18 Claims |

|
1. A coprocessor comprising:
a plurality of processing element circuits arranged in a grid of rows and columns, wherein a given processing element circuit of the plurality of processing element circuits comprises an arithmetic-logic unit (ALU) circuit configured to perform one or more ALU operations on a plurality of input operands to generate a result; and
a queue circuit coupled to the plurality of processing element circuits and including a scheduler circuit configured to issue instruction operations to the plurality of processing element circuits, wherein a first given instruction operation is of either a matrix mode type that causes computations in multiple rows of the grid or a vector mode type that causes computations in a first row of the grid, and wherein the scheduler circuit is configured to concurrently issue, as fused instruction operations, a second given instruction operation with the first given instruction operation based on the first given instruction operation being of the vector mode type and further based on the second given instruction operation being of the vector mode type and using a second row of the grid different from the first row, wherein the first row and the second row are determined based on respective destination identifiers of the first given instruction operation and the second given instruction operation.
|
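The fusion condition recited in claim 1 can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the claimed circuit: the names `Op`, `Mode`, `row_of`, `can_fuse`, and `issue` are hypothetical, the grid height `NUM_ROWS` is assumed, and the mapping from destination identifier to row (a simple modulo) is an assumption, since the claim says only that the rows are determined based on the destination identifiers. The sketch also considers only adjacent operations in the queue and ignores data dependencies, neither of which the claim specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Mode(Enum):
    MATRIX = "matrix"  # computations in multiple rows of the grid
    VECTOR = "vector"  # computations in a single row of the grid


@dataclass(frozen=True)
class Op:
    name: str
    mode: Mode
    dest: int  # hypothetical destination identifier


NUM_ROWS = 8  # assumed grid height, not taken from the claim


def row_of(op: Op) -> int:
    # Assumption: the destination identifier selects a row by modulo.
    return op.dest % NUM_ROWS


def can_fuse(first: Op, second: Op) -> bool:
    # Both operations must be vector mode and must use different rows.
    return (
        first.mode is Mode.VECTOR
        and second.mode is Mode.VECTOR
        and row_of(first) != row_of(second)
    )


def issue(queue: List[Op]) -> List[Tuple[Op, ...]]:
    # Walk the queue in order; each tuple is one issue slot.
    # A two-element tuple models a pair of fused instruction operations.
    issued: List[Tuple[Op, ...]] = []
    i = 0
    while i < len(queue):
        if i + 1 < len(queue) and can_fuse(queue[i], queue[i + 1]):
            issued.append((queue[i], queue[i + 1]))
            i += 2
        else:
            issued.append((queue[i],))
            i += 1
    return issued


ops = [
    Op("v0", Mode.VECTOR, 0),
    Op("v1", Mode.VECTOR, 1),
    Op("m0", Mode.MATRIX, 2),
    Op("v2", Mode.VECTOR, 3),
    Op("v3", Mode.VECTOR, 11),  # 11 % 8 == 3, same row as v2
]
slots = issue(ops)
```

In this model, `v0` and `v1` share one issue slot because they are both vector mode and map to different rows. `m0` issues alone because it is matrix mode, and `v2` and `v3` issue separately because their destination identifiers map to the same row.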