US 12,387,767 B2
Method and apparatus for quantization and dequantization of neural network input and output data using processing-in-memory
Ioannis Papadopoulos, Boxborough, MA (US); Vignesh Adhinarayanan, Austin, TX (US); Ashwin Aji, Santa Clara, CA (US); and Jagadish B. Kotra, Austin, TX (US)
Assigned to Advanced Micro Devices, Inc., Santa Clara, CA (US)
Filed by Advanced Micro Devices, Inc., Santa Clara, CA (US)
Filed on Jun. 30, 2023, as Appl. No. 18/346,110.
Prior Publication US 2025/0006232 A1, Jan. 2, 2025
Int. Cl. G11C 7/10 (2006.01); G06F 3/06 (2006.01)
CPC G11C 7/1006 (2013.01) [G06F 3/0611 (2013.01); G06F 3/0656 (2013.01); G06F 3/0688 (2013.01)] 20 Claims
OG exemplary drawing
 
1. An apparatus comprising:
circuitry configured to:
store, in a given row of a memory array comprising a plurality of rows, a first data value having a first magnitude using a first precision;
receive a first read access request targeting the first data value;
retrieve at least the first data value from the given row;
replace the first magnitude of the first data value with a second magnitude of the first data value using a second precision less than the first precision; and
send the first data value having the second magnitude to a requester that generated the first read access request, in response to the first read access request being a quantized read access.
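The claimed behavior, storing a value at a first precision and returning it at a lower second precision when the read is a quantized read, can be sketched in Python. This is an illustrative model only: float32 as the first precision, float16 as the second, and names such as `PIMRow` are assumptions, not taken from the patent.

```python
import struct


class PIMRow:
    """Illustrative model of one row of a PIM-enabled memory array."""

    def __init__(self, value: float) -> None:
        # Store the data value at the first (higher) precision: IEEE 754 float32.
        self.stored = struct.unpack('<f', struct.pack('<f', value))[0]

    def read(self, quantized: bool = False) -> float:
        """Return the stored value.

        A quantized read replaces the float32 magnitude with a float16
        magnitude before sending the value back to the requester.
        """
        if quantized:
            # Round-trip through half precision (the second, lower precision).
            return struct.unpack('<e', struct.pack('<e', self.stored))[0]
        return self.stored


row = PIMRow(3.14159)
full = row.read()                  # first-precision (float32) magnitude
quant = row.read(quantized=True)   # second-precision (float16) magnitude
```

The requester sees a lower-precision magnitude only when it issues a quantized read; an ordinary read returns the value as stored.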