US 12,066,890 B2
Error-tolerant memory system for machine learning systems
Sudhanva Gurumurthi, Austin, TX (US); and Ganesh Suryanarayan Dasika, Austin, TX (US)
Assigned to Advanced Micro Devices, Inc., Santa Clara, CA (US)
Filed by Advanced Micro Devices, Inc., Santa Clara, CA (US)
Filed on Mar. 25, 2022, as Appl. No. 17/704,474.
Prior Publication US 2023/0305923 A1, Sep. 28, 2023
Int. Cl. G11C 29/00 (2006.01); G06F 11/07 (2006.01); G06F 11/10 (2006.01); G06N 20/00 (2019.01)
CPC G06F 11/1068 (2013.01) [G06F 11/076 (2013.01); G06F 11/0772 (2013.01); G06F 11/1004 (2013.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A method comprising:
receiving an error detection code that indicates an error has occurred within a region of physical memory in an accelerator executing a machine learning system;
determining whether a threshold number of errors have been detected within the region of physical memory; and
outputting, in response to the threshold number of errors having been detected, a notification that the threshold number of errors have been detected within the region of physical memory.
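
The three steps of claim 1 (receive an error indication for a region, compare against a threshold, output a notification) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the class RegionErrorMonitor, the integer region_id, the per-region counter, and the notify callback are illustrative assumptions. The claim does not say how errors are counted or whether the notification repeats; this sketch assumes a simple running count per region and notifies on every error at or above the threshold.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class RegionErrorMonitor:
    """Hypothetical per-region error counter (illustration only)."""

    threshold: int                              # assumed: one threshold for all regions
    notify: Callable[[str], None] = print       # assumed notification sink
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def on_error_detected(self, region_id: int) -> Optional[str]:
        """Handle one error indication for a region of physical memory.

        Returns the notification text if the threshold has been reached,
        otherwise None.
        """
        # Step 1: an error detection code reported an error in this region.
        self.counts[region_id] += 1

        # Step 2: determine whether the threshold number of errors was detected.
        if self.counts[region_id] >= self.threshold:
            # Step 3: output a notification for the region.
            message = (
                f"region {region_id}: {self.counts[region_id]} errors detected "
                f"(threshold {self.threshold})"
            )
            self.notify(message)
            return message
        return None


if __name__ == "__main__":
    monitor = RegionErrorMonitor(threshold=3)
    for region in (7, 7, 8, 7):
        monitor.on_error_detected(region)
```

In this sketch, errors in region 8 do not count toward region 7; only the third error reported for region 7 triggers the notification. A real accelerator would likely track this state in hardware or driver software rather than in a Python dictionary.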