US 12,293,109 B2
Memory device with reinforcement learning with Q-learning acceleration
Ran Zamir, Ramat Gan (IL); Ofir Pele, Netanya (IL); Stella Achtenberg, Netanya (IL); and Omer Fainzilber, Herzeliya (IL)
Assigned to Sandisk Technologies, Inc., Milpitas, CA (US)
Filed by Western Digital Technologies, Inc., San Jose, CA (US)
Filed on Mar. 25, 2021, as Appl. No. 17/212,526.
Claims priority of provisional application 63/073,387, filed on Sep. 1, 2020.
Prior Publication US 2022/0066697 A1, Mar. 3, 2022
Int. Cl. G06F 3/06 (2006.01); G06N 20/00 (2019.01)

CPC G06F 3/0659 (2013.01) [G06F 3/0611 (2013.01); G06F 3/0673 (2013.01); G06N 20/00 (2019.01)]

18 Claims

1. A data storage device, comprising:

a non-volatile memory (NVM) device; and

a controller coupled to the NVM device, wherein the controller is configured to:

receive Q table data from the NVM device;

execute a Q-learning algorithm to create updated Q table data;

write the updated Q table data to a first Q table or a second Q table in the NVM device, wherein:

all actions associated with a state of one or more states of the first Q table and the second Q table are generated on a same wordline of the NVM device; and

on the same wordline, each action of the state of the first Q table and each action of the state of the second Q table are alternating; and

read the same wordline of the NVM device, wherein reading the same wordline senses all actions associated with the state of the one or more states of the first Q table and the second Q table.
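
The layout recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: a wordline is modeled as a plain Python list, and the helper names (interleave_wordline, read_wordline, double_q_update) are hypothetical. The claim does not specify the update rule, so the use of a double Q-learning style update is an assumption, as are the parameters alpha and gamma. The sketch shows only how alternating the action values of two Q tables on one wordline lets a single read return every action value of a state for both tables.

```python
from typing import List, Tuple


def interleave_wordline(q1_row: List[float], q2_row: List[float]) -> List[float]:
    """Pack one state's action values from two Q tables into one wordline.

    Values alternate: [Q1(s,a0), Q2(s,a0), Q1(s,a1), Q2(s,a1), ...].
    The list stands in for a physical wordline (an assumption of this sketch).
    """
    if len(q1_row) != len(q2_row):
        raise ValueError("both Q tables must have the same number of actions")
    wordline: List[float] = []
    for v1, v2 in zip(q1_row, q2_row):
        wordline.append(v1)
        wordline.append(v2)
    return wordline


def read_wordline(wordline: List[float]) -> Tuple[List[float], List[float]]:
    """Model a single read that senses all actions of the state for both tables."""
    return wordline[0::2], wordline[1::2]


def double_q_update(
    wordline: List[float],
    next_wordline: List[float],
    action: int,
    reward: float,
    table: int,
    alpha: float = 0.5,
    gamma: float = 0.5,
) -> List[float]:
    """Return an updated wordline for the current state.

    Assumed rule (double Q-learning style): the table being updated picks the
    best next action, and the other table supplies that action's value.
    """
    q1, q2 = (list(v) for v in read_wordline(wordline))
    n1, n2 = read_wordline(next_wordline)
    own, next_own, next_other = (q1, n1, n2) if table == 1 else (q2, n2, n1)
    best = max(range(len(next_own)), key=lambda a: next_own[a])
    own[action] += alpha * (reward + gamma * next_other[best] - own[action])
    return interleave_wordline(q1, q2)


if __name__ == "__main__":
    current = interleave_wordline([1.0, 2.0], [3.0, 4.0])
    following = interleave_wordline([0.0, 1.0], [0.5, 0.25])
    print(read_wordline(current))
    print(double_q_update(current, following, action=0, reward=1.0, table=1))
```

In this model, one read of each wordline supplies everything the update needs, and only the selected table's entry for the chosen action changes when the wordline is written back.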