US 12,406,169 B2
Optimally clipped tensors and vectors
Charbel Sakr, Mountain View, CA (US); Steve Haihang Dai, Union City, CA (US); Brucek Kurdo Khailany, Austin, TX (US); William James Dally, Incline Village, NV (US); Rangharajan Venkatesan, San Jose, CA (US); and Brian Matthew Zimmer, Berkeley, CA (US)
Assigned to NVIDIA Corporation, Santa Clara, CA (US)
Filed by NVIDIA Corporation, Santa Clara, CA (US)
Filed on Jul. 26, 2022, as Appl. No. 17/814,957.
Claims priority of provisional application 63/303,899, filed on Jan. 27, 2022.
Prior Publication US 2023/0237308 A1, Jul. 27, 2023
Int. Cl. G06F 7/00 (2006.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01)

CPC G06N 3/04 (2013.01) [G06N 3/08 (2013.01)]

26 Claims

1. A computer-implemented method for quantizing tensors of a neural network model comprising multiple processing layers, comprising:

computing first clipping scalars for quantizing first tensors of a first processing layer that is coupled between two processing layers of the multiple processing layers;

processing an input by the neural network model, according to quantized tensors that include the quantized first tensors, by each processing layer of the multiple processing layers in sequence to produce intermediate tensors and an output of the neural network model;

adjusting the first tensors based on a loss gradient; and

updating the first clipping scalars based on a mean squared error to reduce differences between the adjusted first tensors and quantized adjusted first tensors.
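
The four steps recited in claim 1 can be pictured with a small numerical sketch. The following is only an illustration under stated assumptions, not the patented implementation. The three-layer toy network, the 4-bit symmetric uniform quantizer, the straight-through gradient, the learning rate, and the grid search used to minimize the mean squared error are all choices made here for illustration; the claim text does not specify how the mean squared error is minimized. The names quantize and mse_clip are hypothetical helpers.

```python
import numpy as np

def quantize(x, clip, bits=4):
    """Clip x to [-clip, clip] and round to a symmetric uniform grid (assumed quantizer)."""
    levels = 2 ** (bits - 1) - 1            # 7 positive levels for 4 bits
    step = clip / levels
    return np.clip(np.round(x / step), -levels, levels) * step

def mse_clip(x, bits=4, num_candidates=100):
    """Return the candidate clipping scalar with the lowest quantization MSE.

    Grid search is an assumption made for this sketch; any MSE-reducing
    update rule could stand in its place.
    """
    max_abs = float(np.max(np.abs(x)))
    candidates = np.linspace(max_abs / num_candidates, max_abs, num_candidates)
    errors = [np.mean((x - quantize(x, c, bits)) ** 2) for c in candidates]
    return float(candidates[int(np.argmin(errors))])

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))                # toy input batch
target = rng.normal(size=(32, 4))           # toy regression target
w0 = rng.normal(size=(8, 16)) * 0.3         # layer before the first layer
w1 = rng.normal(size=(16, 16)) * 0.3        # "first tensors" of the middle layer
w2 = rng.normal(size=(16, 4)) * 0.3         # layer after the first layer

# Step 1: compute a clipping scalar for the middle layer's weights.
clip1 = mse_clip(w1)

# Step 2: process the input layer by layer using the quantized weights.
# For brevity only the middle layer is quantized here.
h1 = np.maximum(x @ w0, 0.0)                # intermediate tensor
z2 = h1 @ quantize(w1, clip1)
h2 = np.maximum(z2, 0.0)                    # intermediate tensor
y = h2 @ w2                                 # network output

# Step 3: adjust the weights based on a loss gradient. The gradient is passed
# through the quantizer unchanged (straight-through estimate, an assumption).
grad_y = 2.0 * (y - target) / y.size        # gradient of the mean squared loss
grad_z2 = (grad_y @ w2.T) * (z2 > 0)
grad_w1 = h1.T @ grad_z2
w1 = w1 - 0.1 * grad_w1                     # learning rate 0.1 is arbitrary

# Step 4: update the clipping scalar so that the adjusted weights and their
# quantized version differ as little as possible in the MSE sense.
clip1 = mse_clip(w1)
```

In a training loop, steps 2 through 4 would repeat for each batch, so the clipping scalar tracks the weights as they change.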