US 11,055,601 B2
System and methods for creation of learning agents in simulated environments
Jason Crabtree, Vienna, VA (US); and Andrew Sellers, Monument, CO (US)
Assigned to QOMPLX, Inc., Tysons, VA (US)
Filed by QOMPLX, Inc., Tysons, VA (US)
Filed on Jan. 29, 2018, as Appl. No. 15/882,180.
Application 15/882,180 is a continuation in part of application No. 15/835,436, filed on Dec. 7, 2017, granted, now 10,572,828.
Application 15/835,436 is a continuation in part of application No. 15/790,457, filed on Oct. 23, 2017, granted, now 10,884,999.
Application 15/790,457 is a continuation in part of application No. 15/790,327, filed on Oct. 23, 2017, granted, now 10,860,951.
Application 15/790,327 is a continuation in part of application No. 15/616,427, filed on Jun. 7, 2017.
Application 15/790,327 is a continuation in part of application No. 15/882,180.
Application 15/882,180 is a continuation in part of application No. 15/835,312, filed on Dec. 7, 2017.
Application 15/835,312 is a continuation in part of application No. 15/186,453, filed on Jun. 18, 2016.
Application 15/186,453 is a continuation in part of application No. 15/166,158, filed on May 26, 2016.
Application 15/166,158 is a continuation in part of application No. 15/141,752, filed on Apr. 28, 2016, granted, now 10,860,962.
Application 15/790,327 is a continuation in part of application No. 15/141,752.
Application 15/141,752 is a continuation in part of application No. 15/091,563, filed on Apr. 5, 2016, granted, now 10,204,147.
Application 15/091,563 is a continuation in part of application No. 14/986,536, filed on Dec. 31, 2015, granted, now 10,210,255.
Application 14/986,536 is a continuation in part of application No. 14/925,974, filed on Oct. 28, 2015, abandoned.
Claims priority of provisional application 62/568,298, filed on Oct. 4, 2017.
Claims priority of provisional application 62/568,291, filed on Oct. 4, 2017.
Prior Publication US 2018/0300598 A1, Oct. 18, 2018
Int. Cl. G06N 3/00 (2006.01); G06N 5/04 (2006.01); G06N 20/00 (2019.01); H04L 29/08 (2006.01); G06Q 10/06 (2012.01)
CPC G06N 3/006 (2013.01) [G06N 5/043 (2013.01); G06N 20/00 (2019.01); G06Q 10/0637 (2013.01); H04L 67/10 (2013.01)] 3 Claims
OG exemplary drawing
 
1. A system for generating learning agents in simulated environments, comprising:
a computing device comprising a memory and a processor;
an agent creation engine comprising a first plurality of programming instructions stored in the memory and operating on the processor, wherein the first plurality of programming instructions, when operating on the processor, cause the computing device to:
receive one or more agent goals;
select an agent simulation based on the one or more agent goals;
create a plurality of agents, each agent being an instance of the agent simulation and having one of the one or more agent goals; and
at least one simulation manager comprising a second plurality of programming instructions stored in the memory and operating on the processor, wherein the second plurality of programming instructions, when operating on the processor, cause the computing device to:
receive a simulation goal related to the one or more agent goals;
select a dynamic environment simulation based on the simulation goal;
execute the dynamic environment simulation using a plurality of meta-models, wherein the plurality of agents used in the execution is managed according to at least one meta-model, wherein the at least one meta-model comprises a plurality of relationships between the dynamic environment simulation and the agent simulation; and
continue execution of the dynamic environment simulation that evolves with agent behavior from the execution of the plurality of agents and the at least one meta-model until the simulation goal has been reached or until each agent has achieved its agent goal from the one or more agent goals; wherein each of the plurality of agents takes different actions in a non-deterministic environment based on its specifications and the specifications of the dynamic environment simulation to achieve the one or more agent goals; wherein the actions taken by the agents include hierarchical learning according to set agent goals, learning patterns of user behavior, and learned behavior of agents stored from previous simulations; wherein the different actions and learned behaviors acquired by individual agents further differentiate them from each other during the simulation execution; and
examine the outcomes of each individual set of actions taken by each agent according to the simulation goal.
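The claim describes two cooperating components: an agent creation engine that instantiates one agent per agent goal, and a simulation manager that runs a dynamic environment simulation which evolves with agent behavior until the simulation goal is reached or every agent achieves its own goal. The following is a minimal sketch of that control flow, not the patented implementation; all names (`Agent`, `AgentCreationEngine`, `SimulationManager`), the scalar environment state, and the stochastic progress model are hypothetical assumptions, since the claim discloses no source code.

```python
"""Illustrative sketch of claim 1's architecture. All class and method
names here are hypothetical; the patent does not disclose source code."""

import random
from dataclasses import dataclass, field

random.seed(42)  # fixed seed so this sketch is reproducible


@dataclass
class Agent:
    """One instance of the agent simulation, pursuing a single agent goal."""
    goal: str
    progress: float = 0.0
    learned: list = field(default_factory=list)

    def act(self, environment_state: float) -> None:
        # Non-deterministic action: the outcome depends on both the agent's
        # own state and the current state of the dynamic environment.
        step = random.uniform(0.0, 0.3) * environment_state
        self.progress = min(1.0, self.progress + step)
        self.learned.append(step)  # retained behavior differentiates agents

    @property
    def done(self) -> bool:
        return self.progress >= 1.0


class AgentCreationEngine:
    """Creates one agent per received agent goal."""

    def create(self, agent_goals: list[str]) -> list[Agent]:
        return [Agent(goal=g) for g in agent_goals]


class SimulationManager:
    """Runs the dynamic environment simulation until the simulation goal
    (here: a step budget) is reached or every agent achieves its goal."""

    def __init__(self, agents: list[Agent], max_steps: int = 1000):
        self.agents = agents
        self.max_steps = max_steps
        self.environment_state = 1.0  # evolves with agent behavior

    def run(self) -> int:
        for step in range(self.max_steps):
            if all(a.done for a in self.agents):
                return step  # every agent achieved its agent goal
            for agent in self.agents:
                if not agent.done:
                    agent.act(self.environment_state)
            # The environment evolves with aggregate agent behavior, a stand-in
            # for the meta-model relationships between the environment
            # simulation and the agent simulation.
            mean = sum(a.progress for a in self.agents) / len(self.agents)
            self.environment_state = 1.0 + mean
        return self.max_steps


agents = AgentCreationEngine().create(["goal-A", "goal-B", "goal-C"])
steps = SimulationManager(agents).run()
```

The feedback loop in `run` is the essential point: each agent's non-deterministic actions alter the environment state, which in turn changes what subsequent actions achieve, so the environment and the agents co-evolve until a termination condition holds.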