US 12,412,117 B2
Simulation modeling exchange
Sahika Genc, Mercer Island, WA (US); Sunil Mallya Kasaragod, San Francisco, CA (US); Leo Parker Dirac, Seattle, WA (US); Bharathan Balaji, Seattle, WA (US); and Saurabh Gupta, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Nov. 27, 2018, as Appl. No. 16/201,864.
Prior Publication US 2020/0167687 A1, May 28, 2020
Int. Cl. G06N 20/00 (2019.01); B25J 9/16 (2006.01); G06F 30/20 (2020.01); G06F 30/27 (2020.01); G06N 3/006 (2023.01)

CPC G06N 20/00 (2019.01) [B25J 9/1605 (2013.01); B25J 9/163 (2013.01); B25J 9/1671 (2013.01); G06F 30/20 (2020.01); G06F 30/27 (2020.01); G05B 2219/33056 (2013.01); G06N 3/006 (2013.01)]

21 Claims

6. A first system, comprising:

one or more processors; and

memory that stores computer-executable instructions that, if executed, cause the first system to:

execute a simulation of a robotic device in a simulation environment, the simulation comprising an agent representing a second system using a reinforcement learning model to operate within the simulation environment;

obtain data indicating how the agent performed in the simulation environment;

transmit the data to another system to cause the other system to:

run a model training application;

use the data generated by the agent to update the reinforcement learning model to produce an updated reinforcement learning model; and

provide the updated reinforcement learning model to the agent at the simulation environment;

obtain, by the agent, the updated reinforcement learning model;

execute the simulation of the system according to the updated reinforcement learning model;

obtain a notification from the other system that indicates that a termination requirement for the simulation has been satisfied; and

make available the updated reinforcement learning model for optimizing an application of the second system in response to the notification.
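
The exchange recited in claim 6 can be read as a loop between a simulation system and a separate training system. The Python sketch below is only an illustration of that loop under stated assumptions, not the patented implementation: the names `Model`, `TrainingSystem`, `run_episode`, `simulate_until_terminated`, and `GOAL` are hypothetical, the "reinforcement learning model" is reduced to a single toy parameter, and the termination requirement is assumed to be a mean-reward threshold.

```python
from dataclasses import dataclass

GOAL = 1.0  # hypothetical target the toy agent should reach


@dataclass
class Model:
    """Toy stand-in for the reinforcement learning model (assumption)."""
    version: int = 0
    bias: float = 0.0


def run_episode(model, steps=5):
    """Simulation system: run the agent and record how it performed."""
    records = []
    for _ in range(steps):
        action = model.bias          # the agent acts according to its model
        error = GOAL - action
        records.append({"action": action, "error": error, "reward": -abs(error)})
    return records


class TrainingSystem:
    """The 'other system': updates the model from the agent's data."""

    def __init__(self, reward_threshold=-0.05, learning_rate=0.5):
        self.reward_threshold = reward_threshold  # assumed termination requirement
        self.learning_rate = learning_rate

    def train(self, model, records):
        mean_error = sum(r["error"] for r in records) / len(records)
        mean_reward = sum(r["reward"] for r in records) / len(records)
        updated = Model(
            version=model.version + 1,
            bias=model.bias + self.learning_rate * mean_error,
        )
        # The second return value plays the role of the termination notification.
        terminated = mean_reward >= self.reward_threshold
        return updated, terminated


def simulate_until_terminated(max_rounds=50):
    """Alternate simulation and training until the trainer signals termination."""
    model = Model()
    trainer = TrainingSystem()
    for _ in range(max_rounds):
        records = run_episode(model)                       # execute simulation, obtain data
        model, terminated = trainer.train(model, records)  # transmit data, obtain updated model
        if terminated:
            return model                                   # make the updated model available
    raise RuntimeError("termination requirement was not satisfied")
```

In this sketch the two systems are ordinary objects in one process; in a deployment matching the claim they would be separate systems exchanging the data and the updated model over a network.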