US 11,669,731 B2
Solving based introspection to augment the training of reinforcement learning agents for control and planning on robots and autonomous vehicles
Michael A. Warren, Northridge, CA (US); and Christopher Serrano, Glendora, CA (US)
Assigned to HRL LABORATORIES, LLC, Malibu, CA (US)
Filed by HRL Laboratories, LLC, Malibu, CA (US)
Filed on Nov. 21, 2019, as Appl. No. 16/691,446.
Claims priority of provisional application 62/792,352, filed on Jan. 14, 2019.
Prior Publication US 2020/0226464 A1, Jul. 16, 2020
Int. Cl. G06N 3/08 (2023.01); G06N 3/088 (2023.01); G06N 3/048 (2023.01)
CPC G06N 3/08 (2013.01) [G06N 3/088 (2013.01); G06N 3/048 (2023.01)]
12 Claims
 
1. A system for controlling a mobile platform, the system comprising:
the mobile platform having one or more sensors thereon; and
one or more processors and a non-transitory computer-readable medium having executable instructions encoded thereon such that when executed, the one or more processors perform operations of:
determining current states of the mobile platform via the one or more sensors;
initially training a neural network π that is integrated on the mobile platform, wherein the initial training is based on the current states of the mobile platform;
querying a Satisfiability Modulo Theories (SMT) solver when it is determined that a current increment step is on a query schedule,
wherein the query schedule determines when to query the SMT solver to generate a plurality of examples of states satisfying specified constraints of the mobile platform;
modifying the initial training of the neural network π based on the plurality of examples of states;
following training on the plurality of examples of states, selecting an action to be performed by the mobile platform in its environment,
wherein the action is selected from a probability distribution π(s) over a space of valid actions that the mobile platform can take while in the current states; and
causing the mobile platform to perform the selected action in its environment.
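The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and everything named in it is a hypothetical choice made for illustration: a linear softmax model stands in for the neural network π, a brute-force enumerator over a small integer grid stands in for the SMT solver, the query schedule is assumed to be "every fifth increment step", and the constraint, the action names, and the training labels are invented. The sketch only shows how a query schedule can interleave solver-generated example states with ordinary training, and how an action can then be sampled from π(s).

```python
import math
import random

ACTIONS = ["brake", "coast", "accelerate"]  # hypothetical action space


def on_query_schedule(step, period=5):
    """Hypothetical schedule: query the solver every `period` increment steps."""
    return step > 0 and step % period == 0


def near_obstacle_moving_forward(x, v):
    """Hypothetical constraint: position close to an obstacle, velocity toward it."""
    return x >= 2 and v > 0


def stand_in_solver(constraint, lo=-3, hi=3, limit=4):
    """Stand-in for an SMT solver: enumerate integer states (x, v) in a small
    box and return up to `limit` states that satisfy `constraint`."""
    examples = []
    for x in range(lo, hi + 1):
        for v in range(lo, hi + 1):
            if constraint(x, v):
                examples.append((x, v))
                if len(examples) == limit:
                    return examples
    return examples


def policy(weights, state):
    """pi(s): probability distribution over ACTIONS from a linear softmax model."""
    feats = (state[0], state[1], 1.0)
    logits = [sum(w_i * f_i for w_i, f_i in zip(w, feats)) for w in weights]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_step(weights, state, target_idx, lr=0.1):
    """One cross-entropy gradient step pushing pi(state) toward ACTIONS[target_idx]."""
    probs = policy(weights, state)
    feats = (state[0], state[1], 1.0)
    for a, w in enumerate(weights):
        grad = probs[a] - (1.0 if a == target_idx else 0.0)
        for i in range(3):
            w[i] -= lr * grad * feats[i]


def select_action(weights, state, rng):
    """Sample an action from the distribution pi(state)."""
    return rng.choices(ACTIONS, weights=policy(weights, state), k=1)[0]


def train(num_steps=20, seed=0):
    rng = random.Random(seed)
    weights = [[0.0, 0.0, 0.0] for _ in ACTIONS]
    queries = 0
    for step in range(num_steps):
        # Stand-in for a sensor reading of the current state.
        state = (rng.randint(-3, 3), rng.randint(-3, 3))
        # Initial training on the observed state (illustrative label only).
        supervised_step(weights, state, 0 if state[1] > 2 else 1)
        if on_query_schedule(step):
            queries += 1
            # Modify training with solver-generated states; here every
            # generated state is (by assumption) labeled "brake".
            for example in stand_in_solver(near_obstacle_moving_forward):
                supervised_step(weights, example, 0)
    return weights, queries


if __name__ == "__main__":
    weights, queries = train()
    print("solver queries:", queries)
    print("selected action:", select_action(weights, (3, 3), random.Random(1)))
```

In a real system the enumerator would be replaced by calls to an actual SMT solver, and the supervised update by whatever reinforcement learning update the agent uses; the sketch keeps both trivial so that the control flow of the query schedule stays visible.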