US 11,755,320 B2
Compute array of a processor with mixed-precision numerical linear algebra support
Jose E. Moreira, Irvington, NY (US); Brett Olsson, Cary, NC (US); Brian W. Thompto, Austin, TX (US); Silvia Melitta Mueller, Altdorf (DE); and Andreas Wagner, Weil im Schönbuch (DE)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Sep. 21, 2021, as Appl. No. 17/480,279.
Application 17/480,279 is a continuation of application No. 16/712,087, filed on Dec. 12, 2019, granted, now Pat. No. 11,188,328.
Prior Publication US 2022/0004386 A1, Jan. 6, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/38 (2018.01); G06F 9/30 (2018.01); G06F 17/16 (2006.01)

CPC G06F 9/30014 (2013.01) [G06F 9/30145 (2013.01); G06F 9/3855 (2013.01); G06F 9/3893 (2013.01); G06F 17/16 (2013.01)]

20 Claims

1. A computer-implemented method comprising:

determining, by a processor, a first precision and a first shape of a first input matrix to a compute array of the processor, wherein the processor comprises an instruction fetch/decode unit operable to fetch and decode a plurality of instructions comprising at least one instruction to perform a plurality of linear algebra operations, a dispatch/issue unit operable to dispatch the instructions to an issue queue after decoding, and the compute array is associated with the issue queue;

determining, by the processor, a second precision and a second shape of a second input matrix to the compute array of the processor; and

repeating a plurality of linear algebra operations in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.
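The accumulation step recited in claim 1 can be illustrated with a minimal software sketch. It is not the claimed hardware: the loops run sequentially, whereas the claim recites operations performed in parallel within the compute array. The helper names (`to_fp16`, `shape_of`, `rank_k_update`), the 4x2 and 2x4 input shapes, the 4x4 accumulator, and the choice of half-precision inputs with a wider accumulator are all assumptions made for illustration.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def shape_of(matrix):
    """Return (rows, cols) of a list-of-lists matrix."""
    return len(matrix), len(matrix[0])


def rank_k_update(acc, a, b):
    """Add k rank-1 (outer product) updates of a and b into acc, in place.

    a is m x k, b is k x n, acc is m x n. The inputs hold low-precision
    values; acc stays in Python's wider float type (hypothetical stand-in
    for an accumulator register).
    """
    m, k = shape_of(a)
    k2, n = shape_of(b)
    if k != k2 or shape_of(acc) != (m, n):
        raise ValueError("incompatible shapes")
    for r in range(k):  # one rank-1 update per value of r
        for i in range(m):
            for j in range(n):
                acc[i][j] += a[i][r] * b[r][j]
    return acc


# Assumed example: 4x2 and 2x4 inputs rounded to fp16, 4x4 accumulator.
a = [[to_fp16(0.5 * (i + 1)) for _ in range(2)] for i in range(4)]
b = [[to_fp16(0.25 * (j + 1)) for j in range(4)] for _ in range(2)]
acc = [[0.0] * 4 for _ in range(4)]
rank_k_update(acc, a, b)
```

Here the shared dimension of the two input shapes (2 in this example) sets the number of rank-1 updates folded into the accumulator per call. Calling `rank_k_update` again with new inputs keeps adding to the same `acc`.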