| CPC G06N 3/08 (2013.01) [G06F 40/205 (2020.01); G06F 40/289 (2020.01); G06F 40/30 (2020.01)] | 20 Claims |

|
1. A computer-implemented method for reinforcement learning (RL) with Logical Neural Networks (LNNs), the method comprising:
receiving a plurality of observation text sentences from a target environment;
extracting one or more propositional logic values from the plurality of observation text sentences;
finding a class for each propositional logic value by using external knowledge;
converting, by an FOL converter, each propositional logic value into a first-order logic (FOL) by replacing a part in the propositional logic value with a variable word, the part indicating the class;
selecting an LNN based on the class among LNNs prepared in advance for each class, each LNN receiving the one or more propositional logic values as a status input and outputting an action with a score indicating a degree of preference for taking the action; and
performing a highest score action to the target environment to obtain a next state of the target environment and a reward for the highest score action.
|
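For illustration only, the steps recited in claim 1 can be sketched as a toy pipeline. This is a minimal sketch under stated assumptions, not the claimed implementation: the `EXTERNAL_KNOWLEDGE` dictionary, the regular-expression extractor, the `to_fol` output format, the per-class scoring functions (plain Python stand-ins, not actual LNNs), and `toy_env_step` are all hypothetical names and behaviors introduced here. Grouping the propositions by class before passing them to the matching scorer is likewise one possible reading of the selecting step.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for "external knowledge": entity word -> class.
EXTERNAL_KNOWLEDGE = {"apple": "food", "bread": "food", "sword": "weapon"}

# Toy extractor pattern (assumption): "<see|carry> <article> <entity>".
PATTERN = re.compile(r"\b(see|carry)\s+(?:an|a|the)\s+(\w+)")


def extract_propositions(sentences):
    """Extract (predicate, entity) pairs from observation sentences."""
    props = []
    for sentence in sentences:
        match = PATTERN.search(sentence.lower())
        if match:
            props.append((match.group(1), match.group(2)))
    return props


def find_class(prop):
    """Look up the class of the proposition's entity in external knowledge."""
    return EXTERNAL_KNOWLEDGE.get(prop[1], "unknown")


def to_fol(prop, cls):
    """Replace the class-indicating entity with the variable word 'x'."""
    predicate, _entity = prop
    return f"{predicate}(x) & {cls}(x)"


def food_scorer(props):
    """Stand-in for the 'food' LNN: returns (action, score) pairs."""
    table = {"see": ("take", 0.8), "carry": ("eat", 0.9)}
    return [(f"{table[p][0]} {e}", table[p][1]) for p, e in props]


def weapon_scorer(props):
    """Stand-in for the 'weapon' LNN: returns (action, score) pairs."""
    table = {"see": ("take", 0.6), "carry": ("wield", 0.7)}
    return [(f"{table[p][0]} {e}", table[p][1]) for p, e in props]


# One scorer prepared in advance per class.
SCORERS = {"food": food_scorer, "weapon": weapon_scorer}


def choose_action(sentences):
    """Return the highest-scoring (action, score), or None if none apply."""
    by_class = defaultdict(list)
    for prop in extract_propositions(sentences):
        by_class[find_class(prop)].append(prop)
    candidates = []
    for cls, props in by_class.items():
        scorer = SCORERS.get(cls)  # select the scorer based on the class
        if scorer is not None:
            candidates.extend(scorer(props))
    return max(candidates, key=lambda c: c[1]) if candidates else None


def toy_env_step(action):
    """Hypothetical environment: returns (next observation, reward)."""
    if action == "take apple":
        return ["You carry the apple."], 1.0
    return ["Nothing happens."], 0.0
```

In this sketch, calling `choose_action` on the current observation sentences and passing the chosen action to `toy_env_step` yields the next observation and a reward, which can be fed back into `choose_action` for the following step.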