US 12,236,338 B2
Single function to perform combined matrix multiplication and bias add operations
Cedric Lichtenau, Stuttgart (DE); Kailash Gopalakrishnan, New York, NY (US); Vijayalakshmi Srinivasan, New York, NY (US); Sunil K. Shukla, Scarsdale, NY (US); and Swagath Venkataramani, Yonkers, NY (US)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Jun. 17, 2021, as Appl. No. 17/350,665.
Prior Publication US 2022/0405556 A1, Dec. 22, 2022
Int. Cl. G06N 3/063 (2023.01); G06F 7/50 (2006.01); G06F 7/523 (2006.01); G06F 7/544 (2006.01); G06F 9/30 (2018.01); G06F 17/16 (2006.01); G06N 3/08 (2023.01)
CPC G06N 3/063 (2013.01) [G06F 7/50 (2013.01); G06F 7/523 (2013.01); G06F 7/5443 (2013.01); G06F 9/3001 (2013.01); G06F 9/30036 (2013.01); G06F 17/16 (2013.01); G06N 3/08 (2013.01)]
25 Claims
 
1. A computer program product for facilitating processing within a computing environment, the computer program product comprising:
one or more computer readable storage media and program instructions collectively stored on the one or more computer readable storage media to perform a method comprising:
performing a combined function specified by an instruction, the combined function including a plurality of operations performed as part of one invocation of the combined function, wherein the performing the combined function comprises:
performing a matrix multiplication of a first tensor and a second tensor to obtain one or more intermediate results, the second tensor comprising an adjusted weight tensor created using a multiplier; and
adding values of a bias tensor to the one or more intermediate results to obtain one or more results for the combined function, the one or more results being at least a part of an output tensor, and wherein the one or more intermediate results are input to the adding absent a storing and reloading of the one or more intermediate results in a location externally accessible to one or more processors.
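The combined function recited in claim 1 can be illustrated with a minimal software sketch. This is not the claimed instruction or any processor's implementation; it only shows the data flow, in which each intermediate product sum is held in a local accumulator and has the bias added before anything is written to the output tensor. The names `adjust_weights` and `matmul_bias_add` are hypothetical. The per-column scaling used to build the adjusted weight tensor and the per-column broadcast of the bias are assumptions, since the claim says only that the adjusted weight tensor is "created using a multiplier".

```python
from typing import List

Matrix = List[List[float]]


def adjust_weights(weights: Matrix, multiplier: List[float]) -> Matrix:
    """Hypothetical pre-processing step: scale each weight column.

    Assumption for illustration only; the claim does not say how the
    multiplier is applied to produce the adjusted weight tensor.
    """
    return [[w * m for w, m in zip(row, multiplier)] for row in weights]


def matmul_bias_add(x: Matrix, w: Matrix, bias: List[float]) -> Matrix:
    """One invocation: multiply x (n x k) by w (k x m), then add bias (m)."""
    n, k, m = len(x), len(w), len(bias)
    if any(len(row) != k for row in x) or any(len(row) != m for row in w):
        raise ValueError("shape mismatch between x, w, and bias")
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0  # intermediate result exists only in this local
            for p in range(k):
                acc += x[i][p] * w[p][j]
            # The bias is added before the value is stored, so no separate
            # intermediate matrix is ever written out and reloaded.
            out[i][j] = acc + bias[j]
    return out


x = [[1.0, 2.0], [3.0, 4.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
w_adj = adjust_weights(weights, [2.0, 0.5])       # adjusted weight tensor
result = matmul_bias_add(x, w_adj, [10.0, 20.0])  # single combined call
print(result)  # [[12.0, 21.0], [16.0, 22.0]]
```

An unfused version would call a matrix multiply, store the full product, and then call a separate bias add that reloads it. The sketch avoids that round trip by folding the addition into the loop that finishes each output element.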