US 12,032,343 B2
Control system for controlling a machine using a control agent with parallel training of the control agent
Dietrich Baehring, Stuttgart-Ost (DE); Kirolos Samy Attia Abdou, Stuttgart (DE); Klaus Weber, Stuttgart (DE); and Ricardo Esteves Borges, Unterroth (DE)
Assigned to ROBERT BOSCH GMBH, Stuttgart (DE)
Filed by Robert Bosch GmbH, Stuttgart (DE)
Filed on Aug. 10, 2021, as Appl. No. 17/398,320.
Claims priority of application No. 102020210823.4 (DE), filed on Aug. 27, 2020.
Prior Publication US 2022/0066401 A1, Mar. 3, 2022
Int. Cl. G05B 13/02 (2006.01); B25J 9/16 (2006.01); G06N 3/045 (2023.01)
CPC G05B 13/027 (2013.01) [B25J 9/161 (2013.01); G06N 3/045 (2023.01)]
6 Claims
 
1. A machine control system comprising:
a first processing device that includes a first processor; and
a second processing unit that includes at least one second processor;
wherein:
the machine control system is configured to perform a method for controlling a machine, the method comprising, during a control session in which the machine is controlled to achieve a target state from an initial state:
in a first iterative process including a plurality of iterations, the first processor performing the following in each of the iterations of the first iterative process:
executing a first instance of a control agent to select a respective control action;
communicating one or more control commands to the machine for carrying out the selected control action;
receiving respective sensor data from the machine; and
communicating state information about a state of the machine and/or an environment of the machine that is based on the received sensor data to a memory that is accessible by the at least one second processor;
in a second iterative process including a plurality of iterations, the at least one second processor performing the following in each of the iterations of the second iterative process:
performing an iterative reinforcement learning using the state information that has been communicated to the memory during the control session to ascertain an update for the first instance of the control agent by operating on a second instance of the control agent; and
communicating the update to the first data processing device; and
updating the first instance of the control agent by the first data processing device according to the update communicated from the at least one second processor;
the first iterative process and the second iterative process are run in parallel, so that different ones of the iterations of the first iterative process operate on different versions of the first instance of the control agent due to the updating performed based on the communicated updates of the iterations of the second iterative process; and
the at least one second processor includes at least one of (a) one or more graphics processing units (GPUs) and (b) one or more tensor processing units (TPUs).
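
The two parallel iterative processes recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the names `Agent`, `SharedMemory`, `learn_step`, `control_loop`, `training_loop`, and `run_session` are hypothetical, the machine is a simulated one-dimensional system, a Python thread stands in for the second processor (the claim calls for GPUs or TPUs), and the learning step is a simple REINFORCE-style update chosen only because the claim does not name a particular reinforcement learning algorithm.

```python
import queue
import random
import threading
import time
from dataclasses import dataclass


@dataclass
class Agent:
    """Toy control agent: action = gain * (target - state) + exploration noise."""
    gain: float = 0.1
    version: int = 0

    def act(self, state, target, rng, sigma=0.05):
        return self.gain * (target - state) + rng.gauss(0.0, sigma)


class SharedMemory:
    """Buffer written by the control loop and read by the training loop."""
    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def put(self, item):
        with self._lock:
            self._items.append(item)

    def snapshot(self):
        with self._lock:
            return list(self._items)


def learn_step(agent, batch, target, lr=0.01, sigma=0.05):
    """One REINFORCE-style step on the second agent instance (assumed stand-in).

    Ignores that logged actions may come from older agent versions.
    """
    if not batch:
        return agent.gain
    baseline = sum(r for _, _, r in batch) / len(batch)
    grad = 0.0
    for state, action, reward in batch:
        err = target - state
        # Gradient of log N(action; gain * err, sigma^2) with respect to gain
        grad += (reward - baseline) * (action - agent.gain * err) * err / sigma**2
    agent.gain = min(max(agent.gain + lr * grad / len(batch), 0.0), 1.0)
    agent.version += 1
    return agent.gain


def control_loop(agent, memory, updates, steps, target=1.0, seed=0):
    """First iterative process: apply updates, act, sense, log state information."""
    rng = random.Random(seed)
    state, versions = 0.0, []
    for _ in range(steps):
        try:  # drain pending updates; the newest one wins
            while True:
                agent.gain, agent.version = updates.get_nowait()
        except queue.Empty:
            pass
        versions.append(agent.version)
        action = agent.act(state, target, rng)
        next_state = state + action          # simulated machine and sensor reading
        reward = -abs(target - next_state)
        memory.put((state, action, reward))  # state information to shared memory
        state = next_state
        time.sleep(0.001)
    return versions


def training_loop(memory, updates, stop, target=1.0):
    """Second iterative process: learn on a second instance, publish updates."""
    learner = Agent()
    while not stop.is_set():
        batch = memory.snapshot()[-32:]
        if batch:
            learn_step(learner, batch, target)
            updates.put((learner.gain, learner.version))
        time.sleep(0.002)


def run_session(steps=200):
    """Run both processes concurrently for one simulated control session."""
    memory, updates, stop = SharedMemory(), queue.Queue(), threading.Event()
    trainer = threading.Thread(target=training_loop, args=(memory, updates, stop))
    trainer.start()
    try:
        versions = control_loop(Agent(), memory, updates, steps)
    finally:
        stop.set()
        trainer.join()
    return versions, memory.snapshot()
```

Calling `run_session()` returns the agent version used at each control iteration together with the logged state information. Because updates arrive while the session is still running, later control iterations will typically run on a newer version of the first agent instance than earlier ones, which is the behavior the claim describes for the parallel processes.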