| CPC B25J 9/1664 (2013.01) [B25J 9/163 (2013.01); B25J 9/1653 (2013.01); B25J 9/161 (2013.01)] | 19 Claims |

|
1. A memory comprising machine readable instructions to cause at least one processor circuit to:
train a reward function with reinforcement learning, the reward function to define a robot's activities in an environment;
deploy the reward function in the robot to cause the robot to move in the environment in accordance with the reward function;
access reward feedback based on the robot movement; and
process the reward feedback to update the reward function.
|
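The four steps recited in claim 1 (train, deploy, access feedback, update) can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the patented implementation: the claim does not specify the form of the reward function, the environment, or the learning rule. Here the environment is a hypothetical 1-D track with cells 0 to 4, the reward function is a hypothetical linear model `LinearReward`, and a plain gradient step on squared error stands in for the reinforcement learning training named in the claim. All names (`LinearReward`, `train`, `move`, `feedback_for`, `run`) are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LinearReward:
    """Hypothetical reward model: reward(x) = w0 + w1 * x for a 1-D position x."""
    weights: List[float] = field(default_factory=lambda: [0.0, 0.0])
    lr: float = 0.05  # assumed learning rate

    def __call__(self, position: float) -> float:
        return self.weights[0] + self.weights[1] * position

    def update(self, position: float, feedback: float) -> None:
        # One gradient step on the squared error between prediction and feedback.
        error = feedback - self(position)
        self.weights[0] += self.lr * error
        self.weights[1] += self.lr * error * position


def feedback_for(position: int, goal: int = 4) -> float:
    """Stand-in for reward feedback: higher (closer to 0) nearer the goal cell."""
    return -float(abs(goal - position))


def train(reward: LinearReward, epochs: int = 200) -> None:
    """Step 1: fit the reward function on sampled positions (simplified training)."""
    for _ in range(epochs):
        for position in range(5):
            reward.update(position, feedback_for(position))


def move(reward: LinearReward, position: int, lo: int = 0, hi: int = 4) -> int:
    """Step 2: the deployed reward function picks the best reachable cell."""
    candidates = [p for p in (position - 1, position, position + 1) if lo <= p <= hi]
    return max(candidates, key=reward)


def run(steps: int = 8) -> Tuple[LinearReward, int]:
    reward = LinearReward()
    train(reward)                              # train the reward function
    position = 0
    for _ in range(steps):
        position = move(reward, position)      # deploy: move per the reward function
        feedback = feedback_for(position)      # access feedback based on the movement
        reward.update(position, feedback)      # process feedback to update the function
    return reward, position


if __name__ == "__main__":
    learned, final_position = run()
    print("final position:", final_position)
    print("weights:", learned.weights)
```

In this sketch the same `update` method serves both the initial training and the post-deployment refinement, which keeps the example short; a real system could use entirely different mechanisms for the two phases.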