US 12,260,328 B2
Neuro-symbolic reinforcement learning with first-order logic
Daiki Kimura, Kanagawa (JP); Masaki Ono, Tokyo (JP); Subhajit Chaudhury, Kawasaki (JP); and Michiaki Tatsubori, Oiso (JP)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Oct. 5, 2021, as Appl. No. 17/494,055.
Prior Publication US 2023/0108135 A1, Apr. 6, 2023
Int. Cl. G06N 3/08 (2023.01); G06F 40/205 (2020.01); G06F 40/289 (2020.01); G06F 40/30 (2020.01)
CPC G06N 3/08 (2013.01) [G06F 40/205 (2020.01); G06F 40/289 (2020.01); G06F 40/30 (2020.01)]
20 Claims
 
1. A computer-implemented method for reinforcement learning (RL) with Logical Neural Networks (LNNs), the method comprising:
receiving a plurality of observation text sentences from a target environment;
extracting one or more propositional logic values from the plurality of observation text sentences;
finding a class for each propositional logic value by using external knowledge;
converting, by an FOL converter, each propositional logic value into a first-order logic (FOL) by replacing a part in the propositional logic value with a variable word, the part indicating the class;
selecting a LNN based on the class among LNNs prepared in advance for each class, each LNN receiving the one or more propositional logic values as a status input and outputting an action with a score indicating a degree of preference for taking the action; and
performing a highest score action to the target environment to obtain a next state of the target environment and a reward for the highest score action.
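
The steps recited in claim 1 can be read as one pass of an agent loop. The following is a minimal illustrative sketch of that flow, not the patented implementation. Every name in it is hypothetical: the `EXTERNAL_KNOWLEDGE` lookup table, the toy sentence pattern, the `see(x) & food(x)` output format, and `ToyEnv` are all assumptions. The per-class policies are plain Python functions standing in for trained LNNs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for "external knowledge": maps a word to its class.
EXTERNAL_KNOWLEDGE: Dict[str, str] = {"apple": "food", "sword": "weapon"}


@dataclass(frozen=True)
class Proposition:
    predicate: str  # e.g. "see"
    argument: str   # e.g. "apple"


def extract_propositions(sentences: List[str]) -> List[Proposition]:
    """Toy extractor (assumption): 'You see a/an X.' becomes see(X)."""
    props = []
    for sentence in sentences:
        words = sentence.lower().rstrip(".").split()
        if words[:2] == ["you", "see"] and len(words) > 2:
            props.append(Proposition("see", words[-1]))
    return props


def find_class(prop: Proposition) -> str:
    """Look up the class of the proposition's argument."""
    return EXTERNAL_KNOWLEDGE.get(prop.argument, "unknown")


def to_fol(prop: Proposition, cls: str) -> str:
    """Replace the class-bearing part with a variable word 'x'."""
    return f"{prop.predicate}(x) & {cls}(x)"


# Stand-ins for the per-class LNNs: each maps the status input to
# (action, score) pairs. A real LNN would compute these scores.
Policy = Callable[[List[Proposition]], List[Tuple[str, float]]]


def food_policy(props: List[Proposition]) -> List[Tuple[str, float]]:
    return [("eat x", 0.9), ("drop x", 0.1)]


def weapon_policy(props: List[Proposition]) -> List[Tuple[str, float]]:
    return [("take x", 0.8), ("eat x", 0.05)]


POLICIES: Dict[str, Policy] = {"food": food_policy, "weapon": weapon_policy}


class ToyEnv:
    """Hypothetical text environment with a single rewarding action."""

    def observe(self) -> List[str]:
        return ["You see an apple."]

    def step(self, action: str) -> Tuple[List[str], float]:
        reward = 1.0 if action == "eat x" else 0.0
        return ["The apple is gone."], reward


def run_step(env: ToyEnv) -> Tuple[str, str, List[str], float]:
    props = extract_propositions(env.observe())   # receive and extract
    prop = props[0]                               # sketch handles one proposition
    cls = find_class(prop)                        # class from external knowledge
    fol = to_fol(prop, cls)                       # propositional -> FOL
    scored = POLICIES[cls](props)                 # select the policy by class
    action, _ = max(scored, key=lambda p: p[1])   # highest-score action
    next_state, reward = env.step(action)         # act, get next state and reward
    return fol, action, next_state, reward
```

Calling `run_step(ToyEnv())` walks the claimed sequence once. Keying `POLICIES` by class mirrors the claim's selection of an LNN "prepared in advance for each class".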