US 12,282,526 B2
Application programming interface to accelerate matrix operations
Piotr Majcher, Sunnyvale, CA (US); Mostafa Hagog, Folsom, CA (US); and Philippe Vandermersch, San Jose, CA (US)
Assigned to NVIDIA Corporation, Santa Clara, CA (US)
Filed by NVIDIA Corporation, Santa Clara, CA (US)
Filed on Mar. 28, 2024, as Appl. No. 18/620,228.
Application 18/620,228 is a continuation of application No. 16/795,380, filed on Feb. 19, 2020.
Prior Publication US 2024/0256633 A1, Aug. 1, 2024
Int. Cl. G06F 17/16 (2006.01); G06F 9/30 (2018.01); G06N 3/08 (2023.01); G06N 5/046 (2023.01)
CPC G06F 17/16 (2013.01) [G06F 9/3001 (2013.01); G06F 9/30145 (2013.01); G06N 3/08 (2013.01); G06N 5/046 (2013.01)] 20 Claims
OG exemplary drawing
 
1. An acceleration processor unit (APU) comprising:
one or more core complexes, wherein the one or more core complexes include one or more central processing unit (CPU) cores;
one or more graphics complexes, wherein the one or more graphics complexes include one or more compute units and include at least one L2 cache;
one or more fabric interconnects;
one or more memory controllers;
one or more input/output (I/O) bus interfaces, at least one comprising a peripheral component interconnect express (PCIe) interface;
wherein the APU is utilized to implement an application programming interface (API) call to select one or more general matrix-to-matrix multiply (GEMM) implementations from among a plurality of GEMM implementations;
wherein the API call is to use:
input matrices;
a matrix multiply operation descriptor;
a matrix layout descriptor;
a search preferences parameter;
an algorithm count parameter to specify a number of algorithms desired;
a results array parameter;
a results count parameter to store a number of algorithms returned; and
an operation status to indicate whether the API call is successful or whether an error has occurred.
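 
For illustration only: the claim does not name a particular library, but the parameter set it recites (operation descriptor, matrix layout descriptors, search preferences, requested algorithm count, results array, returned-count output, and a status return) closely mirrors the form of NVIDIA's public cuBLASLt heuristic-query routine, cublasLtMatmulAlgoGetHeuristic(). The sketch below shows how such a selection call could look under that assumption; the matrix dimensions, data types, and variable names are chosen for the example and are not taken from the patent.

    /* Hypothetical sketch, not the patented implementation: queries candidate
       GEMM implementations through a cuBLASLt-style heuristic API. */
    #include <cublasLt.h>
    #include <stdio.h>

    int main(void) {
        cublasLtHandle_t handle;
        cublasLtCreate(&handle);

        /* Matrix multiply operation descriptor (claim: "matrix multiply operation descriptor") */
        cublasLtMatmulDesc_t opDesc;
        cublasLtMatmulDescCreate(&opDesc, CUBLAS_COMPUTE_32F, CUDA_R_32F);

        /* Matrix layout descriptors (claim: "matrix layout descriptor");
           assumed 1024x1024 single-precision matrices, column-major */
        cublasLtMatrixLayout_t aDesc, bDesc, cDesc, dDesc;
        cublasLtMatrixLayoutCreate(&aDesc, CUDA_R_32F, 1024, 1024, 1024);
        cublasLtMatrixLayoutCreate(&bDesc, CUDA_R_32F, 1024, 1024, 1024);
        cublasLtMatrixLayoutCreate(&cDesc, CUDA_R_32F, 1024, 1024, 1024);
        cublasLtMatrixLayoutCreate(&dDesc, CUDA_R_32F, 1024, 1024, 1024);

        /* Search preferences parameter (claim: "search preferences parameter") */
        cublasLtMatmulPreference_t preference;
        cublasLtMatmulPreferenceCreate(&preference);

        /* Algorithm count parameter, results array, and results count parameter */
        const int requestedAlgoCount = 8;                 /* number of algorithms desired */
        cublasLtMatmulHeuristicResult_t results[8] = {0}; /* results array */
        int returnedAlgoCount = 0;                        /* number of algorithms returned */

        /* Operation status indicates whether the call succeeded or an error occurred */
        cublasStatus_t status = cublasLtMatmulAlgoGetHeuristic(
            handle, opDesc, aDesc, bDesc, cDesc, dDesc,
            preference, requestedAlgoCount, results, &returnedAlgoCount);

        if (status == CUBLAS_STATUS_SUCCESS) {
            printf("heuristic returned %d candidate GEMM implementations\n", returnedAlgoCount);
        } else {
            printf("heuristic query failed with status %d\n", (int)status);
        }

        /* Release descriptors and the library handle */
        cublasLtMatmulPreferenceDestroy(preference);
        cublasLtMatrixLayoutDestroy(dDesc);
        cublasLtMatrixLayoutDestroy(cDesc);
        cublasLtMatrixLayoutDestroy(bDesc);
        cublasLtMatrixLayoutDestroy(aDesc);
        cublasLtMatmulDescDestroy(opDesc);
        cublasLtDestroy(handle);
        return 0;
    }

In the cuBLASLt API that this sketch assumes, each entry of the results array describes one selected GEMM implementation and can subsequently be passed to the matrix multiply routine itself; the returned count and the status value report, respectively, how many candidates were found and whether the query succeeded, paralleling the results count parameter and operation status recited in the claim.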