US 12,436,509 B2
Device and method for controlling an agent
Jan Guenter Woehlke, Leonberg (DE); Felix Schmitt, Ludwigsburg (DE); and Herke van Hoof, Diemen (NL)
Assigned to ROBERT BOSCH GMBH, Stuttgart (DE)
Filed by Robert Bosch GmbH, Stuttgart (DE)
Filed on Aug. 30, 2022, as Appl. No. 17/898,846.
Claims priority of application No. 10 2021 210 533.5 (DE), filed on Sep. 22, 2021.
Prior Publication US 2023/0090127 A1, Mar. 23, 2023
Int. Cl. G05B 13/02 (2006.01)
CPC G05B 13/027 (2013.01)
13 Claims
 
1. A method for controlling an agent, the method comprising the following steps:
obtaining numerical values of a first set of state variables and a second set of state variables,
wherein:
i) the first set of state variables comprises a set of state variables that represent the agent's state and that are selected depending on relevancy to a goal set to be achieved by the agent;
ii) the second set of state variables comprises a set of state variables that represent sensory information of the agent's state and that are selected depending on relevancy to the goal;
iii) the second set of state variables have a higher resolution than the first set of state variables;
iv) the numerical values of the first set of state variables together with the numerical values of the second set of variables represent a current full state of the agent; and
v) the numerical values of the first set of state variables represent a current partial state of the agent;
determining a state value prior, wherein the state value prior includes, for each of a plurality of potential partial states into which the agent can be transitioned after being in the current partial state, a respective prior estimation of a value or cost of being in the respective potential state for the agent to achieve the goal, the prior estimations being generated based on a model of an environment;
supplying an input to a neural network, wherein the input includes:
a local crop of the state value prior, the local crop being a sub-section of the state value prior centered at the current partial state of the agent and having a pre-defined extent from the current partial state;
the numerical values of the second set of state variables; and
the numerical values of the first set of state variables;
outputting, by the neural network and based on the supplied input, a respective value or cost for each of a plurality of control actions to perform while the agent is at the current state of the agent;
selecting one or more of the control actions based on the values or costs output by the neural network; and
controlling the agent in accordance with control signals for executing the selected one or more of the control actions.
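
The steps of claim 1 can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the coarse grid world, the four-move action set, the use of value iteration to produce the state value prior, the helper names (value_prior, local_crop, TinyQNetwork, select_action), and the tiny untrained network with random weights that stands in for a trained one.

```python
import math
import random

# Hypothetical action set: moves on a coarse grid (up, down, left, right).
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def value_prior(grid, goal, gamma=0.95, sweeps=100):
    """State value prior from a simple environment model (1 = wall, 0 = free).
    Value iteration is an assumed choice; the claim only requires a model."""
    rows, cols = len(grid), len(grid[0])
    v = [[0.0] * cols for _ in range(rows)]
    for _ in range(sweeps):
        new_v = [row[:] for row in v]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1:
                    continue
                if (r, c) == goal:
                    new_v[r][c] = 1.0
                    continue
                best = 0.0
                for dr, dc in ACTIONS:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                        best = max(best, gamma * v[nr][nc])
                new_v[r][c] = best
        v = new_v
    return v

def local_crop(prior, center, extent, fill=0.0):
    """Sub-section of the prior centered at the current partial state,
    reaching `extent` cells in each direction; out-of-range cells get `fill`."""
    r0, c0 = center
    rows, cols = len(prior), len(prior[0])
    crop = []
    for r in range(r0 - extent, r0 + extent + 1):
        row = []
        for c in range(c0 - extent, c0 + extent + 1):
            inside = 0 <= r < rows and 0 <= c < cols
            row.append(prior[r][c] if inside else fill)
        crop.append(row)
    return crop

class TinyQNetwork:
    """Untrained one-hidden-layer network with random weights (placeholder)."""
    def __init__(self, n_in, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_actions)]

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]

def select_action(prior, partial_state, sensor_values, net, extent=1):
    """Build the network input (crop, second set, first set) and pick the
    action with the highest output value."""
    crop = local_crop(prior, partial_state, extent)
    x = [v for row in crop for v in row] + list(sensor_values) + list(partial_state)
    q_values = net.forward(x)
    best = max(range(len(q_values)), key=q_values.__getitem__)
    return ACTIONS[best], q_values

# Example: 3x3 grid with one wall in the middle, goal in the bottom-right cell.
grid = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
prior = value_prior(grid, goal=(2, 2))
state = (0, 0)            # first set: coarse cell position (partial state)
sensors = [0.12, -0.03]   # second set: made-up higher-resolution readings
net = TinyQNetwork(n_in=9 + len(sensors) + len(state), n_hidden=8, n_actions=4)
action, q_values = select_action(prior, state, sensors, net)
print(action, q_values)
```

Because the network here is untrained, the selected action carries no meaning; the sketch only shows how the three inputs are assembled and how one value per control action is turned into a selection. Passing only a fixed-size crop of the prior keeps the network input size independent of the size of the environment.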