US 11,992,943 B2
Method for optimizing a policy for a robot
Lukas Froehlich, Freiburg (DE); Edgar Klenske, Renningen (DE); and Leonel Rozo, Boeblingen (DE)
Assigned to ROBERT BOSCH GMBH, Stuttgart (DE)
Filed by Robert Bosch GmbH, Stuttgart (DE)
Filed on Oct. 13, 2021, as Appl. No. 17/450,794.
Claims priority of application No. 102020213527.4 (DE), filed on Oct. 28, 2020.
Prior Publication US 2022/0126441 A1, Apr. 28, 2022
Int. Cl. G05B 13/00 (2006.01); B25J 9/16 (2006.01); G05B 13/04 (2006.01)

CPC B25J 9/1605 (2013.01) [G05B 13/042 (2013.01)]

12 Claims

1. A method for optimizing a predefined policy for a robot, the policy being a Gaussian mixture model that outputs at least one partial trajectory as a function of a starting state and a target state of the robot, the method comprising the following steps:

initializing a Gaussian process that is suitable for estimating, as a function of a parameterization of the Gaussian mixture model, costs that the robot must incur in order to reach the target state, the Gaussian process including at least one kernel which obtains an input parameter as a function of a distance that is ascertained between probability distributions, which are characterized in each case by the Gaussian mixture model and the Gaussian process, according to a probability product kernel;

creating a plurality of trajectories as a function of the policy;

ascertaining the costs for each of the trajectories of the plurality of trajectories;

optimizing the Gaussian process in such a way that, for the plurality of trajectories, the Gaussian process estimates the ascertained costs as a function of used parameters of the Gaussian mixture model;

ascertaining optimal parameters for the Gaussian mixture model with the aid of the Gaussian process, so that the Gaussian process outputs optimal costs for the optimal parameters;

replacing the parameters of the Gaussian mixture model with the optimal parameters;

supplying sensor signals to the Gaussian mixture model;

outputting, by the Gaussian mixture model, a partial trajectory, using the supplied sensor signals;

ascertaining an actuation signal as a function of the partial trajectory; and

activating and controlling an actuator of the robot using the actuation signal.
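
The surrogate-modelling steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the claim does not fix: the probability product kernel is taken with exponent rho = 1, for which the integral of two Gaussian densities has the closed form N(mu_a; mu_b, Sigma_a + Sigma_b); the Gaussian process kernel is a squared exponential of the distance induced by that kernel; the Gaussian process has zero prior mean; and new parameters are chosen with a lower-confidence-bound rule. All names (`ppk`, `gp_posterior`, `make_gmm`, `toy_cost`) are hypothetical, and `toy_cost` is only a stand-in for costs ascertained from real trajectories.

```python
import numpy as np

def gauss_pdf(x, mean, cov):
    """Density of N(mean, cov) evaluated at x (cov is a d x d array)."""
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** len(mean) * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def ppk(gmm_a, gmm_b):
    """Probability product kernel (assumed rho = 1) between two GMMs.
    A GMM is a tuple (weights[K], means[K, d], covs[K, d, d])."""
    (wa, ma, ca), (wb, mb, cb) = gmm_a, gmm_b
    return sum(wa[i] * wb[j] * gauss_pdf(ma[i], mb[j], ca[i] + cb[j])
               for i in range(len(wa)) for j in range(len(wb)))

def ppk_sq_distance(gmm_a, gmm_b):
    """Squared distance induced by the kernel: k(a,a) - 2 k(a,b) + k(b,b)."""
    d2 = ppk(gmm_a, gmm_a) - 2.0 * ppk(gmm_a, gmm_b) + ppk(gmm_b, gmm_b)
    return max(d2, 0.0)  # clip tiny negative values from rounding

def gp_kernel(gmm_a, gmm_b, length_scale=0.2):
    """Assumed GP kernel: squared exponential of the distance above."""
    return np.exp(-ppk_sq_distance(gmm_a, gmm_b) / (2.0 * length_scale ** 2))

def gp_posterior(train_gmms, train_costs, query_gmm, noise=1e-6):
    """Zero-mean GP posterior mean and variance of the cost at query_gmm."""
    K = np.array([[gp_kernel(a, b) for b in train_gmms] for a in train_gmms])
    K += noise * np.eye(len(train_gmms))
    k_star = np.array([gp_kernel(a, query_gmm) for a in train_gmms])
    mean = k_star @ np.linalg.solve(K, train_costs)
    var = 1.0 - k_star @ np.linalg.solve(K, k_star)  # gp_kernel(q, q) == 1
    return float(mean), max(float(var), 0.0)

def make_gmm(shift):
    """Hypothetical 1-D policy: two equally weighted unit-variance components."""
    weights = np.array([0.5, 0.5])
    means = np.array([[shift], [shift + 2.0]])
    covs = np.array([[[1.0]], [[1.0]]])
    return weights, means, covs

def toy_cost(shift):
    """Stand-in for the cost ascertained from trajectories."""
    return (shift - 1.0) ** 2

# Illustrative optimization loop over a grid of candidate parameters.
candidates = np.linspace(-2.0, 4.0, 25)
tried = [-2.0, 4.0]
costs = [toy_cost(s) for s in tried]
for _ in range(8):
    gmms = [make_gmm(s) for s in tried]
    scores = []
    for s in candidates:
        mean, var = gp_posterior(gmms, np.array(costs), make_gmm(s))
        scores.append(mean - 2.0 * np.sqrt(var))  # lower confidence bound
    nxt = float(candidates[int(np.argmin(scores))])
    tried.append(nxt)
    costs.append(toy_cost(nxt))

# Parameters with the lowest observed cost replace the current ones.
best_shift = tried[int(np.argmin(costs))]
```

In this sketch the Gaussian process compares policies through the distributions they define rather than through raw parameter vectors, which is the role the claim assigns to the probability product kernel. The remaining steps of the claim (supplying sensor signals, outputting a partial trajectory, and actuating the robot) depend on hardware and are not modelled here.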