US 12,190,231 B2
Method and apparatus for neural network quantization
Yoo Jin Choi, San Diego, CA (US); Mostafa El-Khamy, San Diego, CA (US); and Jungwon Lee, San Diego, CA (US)
Assigned to Samsung Electronics Co., Ltd. (KR)
Filed by Samsung Electronics Co., Ltd., Gyeonggi-do (KR)
Filed on Sep. 6, 2017, as Appl. No. 15/697,035.
Application 15/697,035 is a continuation in part of application No. 15/433,531, filed on Feb. 15, 2017, granted, now 11,321,609.
Claims priority of provisional application 62/480,857, filed on Apr. 3, 2017.
Claims priority of provisional application 62/409,961, filed on Oct. 19, 2016.
Prior Publication US 2018/0107926 A1, Apr. 19, 2018
Int. Cl. G06N 3/08 (2023.01); G06F 7/58 (2006.01); G06N 3/047 (2023.01); G06N 3/063 (2023.01)

CPC G06N 3/08 (2013.01) [G06F 7/582 (2013.01); G06N 3/047 (2023.01); G06N 3/063 (2013.01)]

20 Claims

1. A method for network quantization using multi-dimensional vectors, comprising:

constructing multi-dimensional vectors representing network parameters from a trained neural network model;

quantizing the multi-dimensional vectors to obtain shared quantized vectors as cluster centers;

individually fine-tuning each dimensional element of each of the shared quantized vectors/cluster centers with a stochastic gradient descent method using an average gradient of a network loss function for a respective divided group of the network parameters corresponding to the dimensional element; and

encoding using the shared quantized vectors/cluster centers.
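
The four steps of claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function names (build_vectors, quantize, fine_tune_step, decode) are hypothetical, plain k-means is assumed for the quantization step, and "encoding" is represented simply as a codebook of cluster centers plus one index per vector, since the claim does not fix an encoding scheme. The gradients passed to fine_tune_step are assumed to come from whatever network loss function is in use.

```python
import numpy as np

def build_vectors(params, dim):
    """Step 1: group flat network parameters into dim-dimensional vectors."""
    flat = np.asarray(params, dtype=np.float64).ravel()
    if flat.size % dim != 0:
        raise ValueError("parameter count must be divisible by dim in this sketch")
    return flat.reshape(-1, dim)

def nearest_center(vectors, centers):
    """Index of the closest center (squared Euclidean distance) per vector."""
    diffs = vectors[:, None, :] - centers[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)

def quantize(vectors, k, iters=20, seed=0):
    """Step 2: k-means (an assumption) yielding shared cluster centers."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(vectors), size=k, replace=False)
    centers = vectors[start].copy()
    for _ in range(iters):
        assign = nearest_center(vectors, centers)
        for c in range(k):
            members = vectors[assign == c]
            if len(members):  # leave an empty cluster's center unchanged
                centers[c] = members.mean(axis=0)
    return centers, nearest_center(vectors, centers)

def fine_tune_step(centers, assign, grads, lr):
    """Step 3: one SGD step per center element.

    grads has the same (n_vectors, dim) layout as the vectors. Element j of
    center c moves along the average gradient of the parameters that sit in
    position j of the vectors assigned to cluster c.
    """
    tuned = centers.copy()
    for c in range(len(centers)):
        mask = assign == c
        if mask.any():
            tuned[c] -= lr * grads[mask].mean(axis=0)
    return tuned

def decode(centers, assign):
    """Step 4 (inverse): rebuild flat parameters from codebook and indices."""
    return centers[assign].ravel()
```

In use, grads would be the gradient of the network loss with respect to the decoded (quantized) parameters, reshaped to the (n_vectors, dim) layout, and fine_tune_step would be repeated over training batches while the assignment indices stay fixed. The pair (centers, assign) is the quantity that a later lossless coding stage would compress.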