US 11,861,328 B2
Processor for fine-grain sparse integer and floating-point operations
Ali Shafiee Ardestani, Santa Clara, CA (US); and Joseph H. Hassoun, San Jose, CA (US)
Assigned to Samsung Electronics Co., Ltd., Yongin-si (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Dec. 23, 2020, as Appl. No. 17/133,288.
Claims priority of provisional application 63/112,299, filed on Nov. 11, 2020.
Prior Publication US 2022/0147313 A1, May 12, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 7/487 (2006.01); G06F 7/485 (2006.01); G06F 7/544 (2006.01); G06N 3/063 (2023.01)
CPC G06F 7/4876 (2013.01) [G06F 7/485 (2013.01); G06F 7/5443 (2013.01); G06N 3/063 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method for performing computations by a neural network via a processing circuit, the method comprising:
identifying, by the processing circuit, a first plurality of weights having a first arrangement;
generating, by the processing circuit, a first set of products, each product of the first set of products being an integer product of a first activation value and a respective weight of the first plurality of weights;
generating, by the processing circuit, a second set of products, each product of the second set of products being a floating-point product of a second activation value and a respective weight of a second plurality of weights; and
outputting, by the processing circuit, at least one of the first set of products or the second set of products for use by the neural network,
each of the weights of the first plurality of weights including a least significant sub-word and a most significant sub-word,
the most significant sub-word of a first weight of the first plurality of weights being nonzero,
the most significant sub-word of a second weight of the first plurality of weights being zero,
the generating of the first set of products comprising:
processing the first plurality of weights to have a second arrangement different from the first arrangement;
storing the first plurality of weights arranged according to the second arrangement into a first memory space;
multiplying, in a first multiplier, the first activation value by the least significant sub-word of the first weight stored in the first memory space to form a first partial product;
multiplying, in a second multiplier, the first activation value by the least significant sub-word of the second weight stored in the first memory space;
multiplying, in a third multiplier, the first activation value by the most significant sub-word of the first weight to form a second partial product; and
adding the first partial product and the second partial product;
the generating of the second set of products comprising forming a first floating-point product,
the forming of the first floating-point product comprising multiplying, in the first multiplier, a first sub-word of a mantissa of the second activation value by a first sub-word of a mantissa of a first weight of the second plurality of weights, to form a third partial product.
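For illustration only (this is not part of the claim or the patent's disclosure), the sub-word partial-product scheme recited above can be sketched in Python. The 8-bit weight width, 4-bit sub-word width, and function names are assumptions chosen for the sketch: each weight is split into a least significant and a most significant sub-word, the activation is multiplied by each sub-word in a separate narrow multiplier, and the two partial products are aligned and added.

```python
def split_subwords(weight, sub_bits=4):
    """Split an unsigned weight into (least significant, most significant)
    sub-words of sub_bits each, as recited in the claim."""
    mask = (1 << sub_bits) - 1
    lsw = weight & mask
    msw = (weight >> sub_bits) & mask
    return lsw, msw

def integer_product(activation, weight, sub_bits=4):
    """Form the integer product from two partial products:
    (activation * lsw) plus (activation * msw) shifted up by the
    sub-word width, mirroring the first and third multipliers."""
    lsw, msw = split_subwords(weight, sub_bits)
    first_partial = activation * lsw          # first multiplier
    second_partial = activation * msw         # third multiplier
    return first_partial + (second_partial << sub_bits)

# A weight whose most significant sub-word is zero (the claim's
# "second weight") needs only the least-significant multiply; the
# multiplier that would have handled its upper sub-word is free,
# e.g. to process a mantissa sub-word of a floating-point operand.
assert integer_product(3, 0x5A) == 3 * 0x5A
assert split_subwords(0x07) == (0x7, 0x0)   # zero most significant sub-word
```

This freed-multiplier observation is the hook for the fine-grain sparsity in the patent's title: when a weight's upper sub-word is zero, the corresponding narrow multiplier can be reassigned rather than left idle.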