| CPC G05B 13/027 (2013.01) | 13 Claims |

|
1. A method for controlling an agent, the method comprising the following steps:
obtaining numerical values of a first set of state variables and a second set of state variables,
wherein:
i) the first set of state variables comprises a set of state variables that represent the agent's state and that are selected depending on relevancy to a goal set to be achieved by the agent;
ii) the second set of state variables comprises a set of state variables that represent sensory information of the agent's state and that are selected depending on relevancy to the goal;
iii) the second set of state variables has a higher resolution than the first set of state variables;
iv) the numerical values of the first set of state variables together with the numerical values of the second set of state variables represent a current full state of the agent; and
v) the numerical values of the first set of state variables represent a current partial state of the agent;
determining a state value prior, wherein the state value prior includes, for each of a plurality of potential partial states into which the agent can be transitioned after being in the current partial state, a respective prior estimation of a value or cost of being in the respective potential state for the agent to achieve the goal, the prior estimations being generated based on a model of an environment;
supplying an input to a neural network, wherein the input includes:
a local crop of the state value prior, the local crop being a sub-section of the state value prior centered at the current partial state of the agent and having a pre-defined extent from the current partial state;
the numerical values of the second set of state variables; and
the numerical values of the first set of state variables;
outputting, by the neural network and based on the supplied input, a respective value or cost for each of a plurality of control actions to perform while the agent is at the current state of the agent;
selecting one or more of the control actions based on the values or costs output by the neural network; and
controlling the agent in accordance with control signals for executing the selected one or more of the control actions.
|
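
The steps recited in claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the claim: the partial state is taken to be a cell on a 2D grid, the state value prior is a cost-to-go map computed by value iteration over a hypothetical step-cost map, and the neural network is an untrained two-layer perceptron with random weights. The names `value_prior`, `local_crop`, `q_values`, and `select_action` are hypothetical.

```python
import numpy as np


def value_prior(cost_map, goal, n_iters=50):
    """Assumed prior: cost-to-go to `goal` on a grid with 4-neighbour moves."""
    v = np.full(cost_map.shape, np.inf)
    v[goal] = 0.0
    for _ in range(n_iters):
        p = np.pad(v, 1, constant_values=np.inf)
        # Cost-to-go of the up, down, left and right neighbour of every cell.
        neighbours = np.stack(
            [p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]]
        )
        v = np.minimum(v, cost_map + neighbours.min(axis=0))
        v[goal] = 0.0
    return v


def local_crop(prior, center, extent, fill):
    """Sub-section of the prior centered at `center`, `extent` cells each way.

    Cells outside the prior are filled with `fill` (an assumed convention).
    """
    padded = np.pad(prior, extent, constant_values=fill)
    r, c = center
    return padded[r:r + 2 * extent + 1, c:c + 2 * extent + 1]


def q_values(crop, first_vars, second_vars, params):
    """Placeholder network: one cost estimate per control action."""
    x = np.concatenate([crop.ravel(), first_vars, second_vars])
    hidden = np.tanh(params["w1"] @ x + params["b1"])
    return params["w2"] @ hidden + params["b2"]


def select_action(costs):
    """Pick the action with the lowest estimated cost."""
    return int(np.argmin(costs))


# Example under the stated assumptions: 5x5 grid, unit step cost, goal at (0, 0).
prior = value_prior(np.ones((5, 5)), goal=(0, 0))

first_vars = np.array([2.0, 2.0])        # low-resolution partial state (grid cell)
second_vars = np.array([0.1, -0.3, 0.7])  # higher-resolution sensory values (made up)
crop = local_crop(prior, center=(2, 2), extent=1, fill=99.0)

n_in, n_hidden, n_actions = crop.size + first_vars.size + second_vars.size, 16, 4
rng = np.random.default_rng(0)
params = {
    "w1": rng.normal(size=(n_hidden, n_in)), "b1": np.zeros(n_hidden),
    "w2": rng.normal(size=(n_actions, n_hidden)), "b2": np.zeros(n_actions),
}

action = select_action(q_values(crop, first_vars, second_vars, params))
```

Because the weights are random, the selected action here carries no meaning; in practice the network would be trained, and `np.argmax` would replace `np.argmin` if the outputs were values rather than costs.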