US 12,354,027 B2
Method and system for an intelligent artificial agent
Mark Bishop Ring, Anaheim, CA (US); Satinder Baveja, Ann Arbor, MI (US); Peter Stone, Austin, TX (US); James MacGlashan, Riverside, RI (US); Samuel Barrett, Somerville, MA (US); Roberto Capobianco, Itri (IT); Varun Kompella, Aachen (DE); Kaushik Subramanian, Richmond, CA (US); and Peter Wurman, Acton, MA (US)
Assigned to SONY GROUP CORPORATION, Tokyo (JP)
Filed by Sony Corporation, Tokyo (JP); and Sony Corporation of America, New York, NY (US)
Filed on Apr. 3, 2018, as Appl. No. 15/943,947.
Prior Publication US 2019/0303776 A1, Oct. 3, 2019
Int. Cl. G06N 20/00 (2019.01); G06N 5/043 (2023.01)
CPC G06N 5/043 (2013.01) [G06N 20/00 (2019.01)]
16 Claims
 
1. A method for training an artificial intelligent agent to recognize a goal configuration, comprising:
placing the agent in the goal configuration and identifying a resulting state as a positive example;
providing negative examples to the agent that demonstrate the agent in a state failing to achieve the goal configuration;
extracting key state features when the agent is in the goal configuration, the key state features including at least one of a room feature, object positioning, ambient lighting, and ambient sounds;
determining what feature categories are important in the goal configuration during receipt of positive examples to the agent;
learning and recognizing, by the agent, the goal configuration based on the extracted key state features and the determined important feature categories;
creating policies, by the agent, based on the learned goal configuration;
converting state features into a distance function to determine how far the agent is from the goal configuration;
using goal detection as a final reward; and
using a goal distance as an intermediate reward.
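
The steps of claim 1 can be illustrated with a small sketch. This is not the patented implementation; it is one possible reading under stated assumptions. States are assumed to be dictionaries of numeric features. A feature is assumed to be "important" when it varies little across the positive examples. The distance function is assumed to be a weighted Euclidean distance to the mean positive example, and the detection threshold is assumed to lie midway between the farthest positive and the nearest negative example. All function names, feature names, and values (`learn_goal_model`, `goal_distance`, `is_goal`, `reward`, `light`, `cup_x`, `noise`) are hypothetical.

```python
import math
from statistics import mean, pvariance


def goal_distance(model, state):
    """Weighted Euclidean distance from a state to the learned goal centroid."""
    return math.sqrt(sum(
        w * (state[f] - model["centroid"][f]) ** 2
        for f, w in model["weights"].items()
    ))


def learn_goal_model(positives, negatives, eps=1e-3):
    """Fit a centroid, per-feature weights, and a detection threshold.

    Assumption: features that stay nearly constant across the positive
    examples are the important ones, so they receive larger weights.
    """
    features = sorted(positives[0])
    centroid = {f: mean(p[f] for p in positives) for f in features}
    raw = {f: 1.0 / (pvariance([p[f] for p in positives]) + eps)
           for f in features}
    total = sum(raw.values())
    weights = {f: raw[f] / total for f in features}
    model = {"centroid": centroid, "weights": weights, "threshold": 0.0}
    # Assumption: place the threshold midway between the farthest positive
    # example and the nearest negative example.
    farthest_pos = max(goal_distance(model, p) for p in positives)
    nearest_neg = min(goal_distance(model, n) for n in negatives)
    model["threshold"] = (farthest_pos + nearest_neg) / 2.0
    return model


def is_goal(model, state):
    """Goal detection: the state lies within the learned threshold."""
    return goal_distance(model, state) <= model["threshold"]


def reward(model, state, final_reward=1.0):
    """Final reward on goal detection, otherwise the negated goal distance."""
    if is_goal(model, state):
        return final_reward
    return -goal_distance(model, state)


# Hypothetical examples: lighting and cup position matter, noise does not.
positives = [
    {"light": 1.0, "cup_x": 0.50, "noise": 0.1},
    {"light": 1.0, "cup_x": 0.52, "noise": 0.9},
    {"light": 1.0, "cup_x": 0.48, "noise": 0.5},
]
negatives = [
    {"light": 0.0, "cup_x": 0.50, "noise": 0.5},  # lights off
    {"light": 1.0, "cup_x": 0.10, "noise": 0.5},  # cup misplaced
]
model = learn_goal_model(positives, negatives)
```

In this sketch the negative examples only set the detection threshold. If a negative example lies closer to the centroid than some positive example, the midpoint rule no longer separates the two sets, and a learned classifier would be a more robust choice for the detection step.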