CPC G06N 3/006 (2013.01) [G06F 18/2111 (2023.01); G06F 18/2132 (2023.01); G06F 18/217 (2023.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)] | 20 Claims |
1. A computer-implemented method for training an agent to perform a reinforcement learning task, the method comprising:
obtaining a plurality of demonstration sequences, each of the demonstration sequences being a sequence of images of an environment while a respective instance of the reinforcement learning task is being performed;
for each demonstration sequence, processing each image in the demonstration sequence through an image processing neural network comprising a plurality of hidden layers to determine feature values for a respective set of features for the image from activations generated by one or more of the hidden layers;
determining, from the demonstration sequences, a partitioning of the reinforcement learning task into a plurality of subtasks, wherein each image in each demonstration sequence is assigned to a respective subtask of the plurality of subtasks;
determining, from the feature values for the images in the demonstration sequences, a respective set of discriminative features for each of the plurality of subtasks; and
training the agent to perform the reinforcement learning task using one or more perception-based rewards computed from image feature values for the discriminative features for the subtasks.
|
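The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and the claim does not fix any of the choices made here. The sketch assumes that feature values have already been extracted from hidden-layer activations and are given as plain lists of floats. Every name in it (`partition_equal`, `select_discriminative`, `perception_reward`) is hypothetical. The equal-length split, the mean-gap-over-spread score, and the negative-squared-distance reward are stand-ins chosen only for illustration.

```python
from statistics import mean, pstdev


def partition_equal(seq_len, n_subtasks):
    """Assign each frame index to a subtask by an even split (illustrative assumption)."""
    return [min(i * n_subtasks // seq_len, n_subtasks - 1) for i in range(seq_len)]


def select_discriminative(features, labels, subtask, top_k):
    """Return the indices of the top_k features that best separate `subtask` frames
    from all other frames, scored as |mean gap| / (sum of standard deviations).
    Assumes both groups are non-empty."""
    n_features = len(features[0])
    scores = []
    for j in range(n_features):
        inside = [f[j] for f, l in zip(features, labels) if l == subtask]
        outside = [f[j] for f, l in zip(features, labels) if l != subtask]
        gap = abs(mean(inside) - mean(outside))
        spread = pstdev(inside) + pstdev(outside) + 1e-8  # avoid division by zero
        scores.append(gap / spread)
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:top_k]


def perception_reward(feature_vec, selected, targets):
    """Negative squared distance to the target values, over the selected features only."""
    return -sum((feature_vec[j] - targets[j]) ** 2 for j in selected)


# Two toy demonstration sequences: 4 frames each, 3 feature values per frame.
seq_a = [[0.0, 1.0, 0.5], [0.1, 1.0, 0.4], [1.0, 1.0, 0.5], [0.9, 1.0, 0.6]]
seq_b = [[0.0, 1.0, 0.4], [0.2, 1.0, 0.5], [0.8, 1.0, 0.6], [1.0, 1.0, 0.5]]

# Pool frames and per-frame subtask labels across the demonstrations.
features = seq_a + seq_b
labels = partition_equal(len(seq_a), 2) + partition_equal(len(seq_b), 2)

# Pick one discriminative feature for subtask 1 and use its mean as the target.
selected = select_discriminative(features, labels, subtask=1, top_k=1)
targets = {
    j: mean(f[j] for f, l in zip(features, labels) if l == 1) for j in selected
}

# A frame from subtask 1 should score higher than a frame from subtask 0.
reward_late = perception_reward(seq_a[2], selected, targets)
reward_early = perception_reward(seq_a[0], selected, targets)
```

In this toy data only the first feature changes between the two halves of each sequence, so it is the one selected for subtask 1, and a frame from the second half receives a higher reward than a frame from the first. A reward of this kind could then be passed to any reinforcement learning procedure as the training signal.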