US 11,701,041 B2
Robotic interactions for observable signs of intent
Sławomir Wojciechowski, Wrocław (PL); Gregg Podnar, Lakewood, OH (US); T. William Mather, Philadelphia, PA (US); Theodore Enns, Menlo Park, CA (US); and Daniel Oblinger, San Francisco, CA (US)
Assigned to AEOLUS ROBOTICS, INC., San Jose, CA (US)
Filed by AEOLUS ROBOTICS, INC., South San Francisco, CA (US)
Filed on May 23, 2019, as Appl. No. 16/421,126.
Claims priority of provisional application 62/675,729, filed on May 23, 2018.
Claims priority of provisional application 62/675,730, filed on May 23, 2018.
Prior Publication US 2019/0358820 A1, Nov. 28, 2019
Int. Cl. A61B 5/16 (2006.01); B25J 11/00 (2006.01); A61B 5/00 (2006.01); B25J 9/16 (2006.01); G16H 50/20 (2018.01); B25J 9/00 (2006.01); G10L 15/22 (2006.01); G06V 20/00 (2022.01); G06V 40/20 (2022.01); G16H 50/30 (2018.01); A61B 5/11 (2006.01); G06V 10/80 (2022.01); G06V 20/10 (2022.01)
CPC A61B 5/165 (2013.01) [A61B 5/0077 (2013.01); A61B 5/1113 (2013.01); A61B 5/1128 (2013.01); A61B 5/4803 (2013.01); A61B 5/7267 (2013.01); A61B 5/746 (2013.01); B25J 9/0003 (2013.01); B25J 9/163 (2013.01); B25J 9/1661 (2013.01); B25J 9/1697 (2013.01); B25J 11/0005 (2013.01); B25J 11/008 (2013.01); B25J 11/009 (2013.01); G06V 10/811 (2022.01); G06V 20/10 (2022.01); G06V 20/35 (2022.01); G06V 40/20 (2022.01); G06V 40/28 (2022.01); G10L 15/22 (2013.01); G16H 50/20 (2018.01); G16H 50/30 (2018.01); A61B 2560/0242 (2013.01); A61B 2560/0252 (2013.01); A61B 2560/0257 (2013.01); A61B 2560/0261 (2013.01); A61B 2562/0247 (2013.01); A61B 2562/0271 (2013.01); G10L 2015/223 (2013.01); G10L 2015/228 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A robotic device, comprising:
a plurality of sensors configured to:
capture a set of one or more images of an individual,
record sound signals of the individual, and
generate environment data of an environment surrounding the individual;
a processor; and
memory storing instructions configured to cause the processor to perform:
recognizing a set of actions performed by the individual from the set of one or more images of the individual;
recognizing verbal content from the individual from the sound signals;
inputting the set of actions performed by the individual, past actions of the individual, the verbal content from the individual, and the environment data to a machine learning model to predict the individual's intent, wherein training of the machine learning model comprises:
initializing the machine learning model with a set of parameters and an objective function, the set of parameters comprising a first set of parameters related to sensor data captured by the plurality of sensors, a second set of parameters related to the individual's schedules, and a third set of parameters related to typical human behaviors;
generating a first set of training data related to past observed situations, the first set of training data comprising the past actions of the individual, sensor data captured by the plurality of sensors during the past observed situations, and labels of identified intents corresponding to the past observed situations;
capturing the individual's schedules, the individual's schedules comprising times and locations of past individual actions;
generating a second set of training data related to one or more daily time distributions of the individual's schedules;
generating a third set of training data related to publicly available information of typical human behavior in one or more situations;
inputting the first set of training data related to past observed situations, the second set of training data related to one or more daily time distributions of the individual's schedules, and the third set of training data related to publicly available information of typical human behavior to the machine learning model for the machine learning model to determine predicted intents of the past observed situations;
determining a value of the objective function that measures a degree of matching between the predicted intents of the past observed situations and the labels of identified intents recorded in the past observed situations; and
adjusting the first set of parameters related to sensor data captured by the plurality of sensors, the second set of parameters related to the individual's schedules, and the third set of parameters related to typical human behaviors based on the value of the objective function;
identifying one or more tasks corresponding to the individual's intent; and
determining a set of actions to be performed by the robotic device based on the one or more tasks.
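The training procedure recited in the claim (initializing three parameter groups, generating labeled training data, predicting intents, scoring a degree-of-matching objective, and adjusting the parameters) can be sketched as a minimal linear intent classifier. This is an illustrative reading only, not the patentee's implementation: the intent labels, feature encodings, and the perceptron-style update rule below are all assumptions made for the sketch.

```python
import random

# Hypothetical intent labels; the claim does not enumerate specific intents.
INTENTS = ["fetch_water", "turn_on_light"]

def init_params(n_sensor, n_schedule, n_behavior, n_intents):
    """Initialize the model with three parameter groups, one per claimed data source."""
    rng = random.Random(0)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "sensor": mat(n_intents, n_sensor),      # parameters related to sensor data
        "schedule": mat(n_intents, n_schedule),  # parameters related to the individual's schedules
        "behavior": mat(n_intents, n_behavior),  # parameters related to typical human behaviors
    }

def predict(params, example):
    """Predict an intent index from the three feature groups via linear scores."""
    scores = []
    for i in range(len(INTENTS)):
        s = 0.0
        for group in ("sensor", "schedule", "behavior"):
            s += sum(w * x for w, x in zip(params[group][i], example[group]))
        scores.append(s)
    return scores.index(max(scores))

def objective(params, data):
    """Degree of matching between predicted intents and the labeled intents."""
    correct = sum(1 for ex, label in data if predict(params, ex) == label)
    return correct / len(data)

def train(params, data, lr=0.1, epochs=50):
    """Adjust all three parameter groups when a prediction mismatches its label."""
    for _ in range(epochs):
        for ex, label in data:
            pred = predict(params, ex)
            if pred != label:
                for group in ("sensor", "schedule", "behavior"):
                    for j, x in enumerate(ex[group]):
                        params[group][label][j] += lr * x
                        params[group][pred][j] -= lr * x
    return params

# Toy training set: each example bundles sensor-derived features, a schedule
# feature (e.g. time-of-day bucket), and a typical-behavior prior feature.
data = [
    ({"sensor": [1.0, 0.0], "schedule": [1.0], "behavior": [1.0, 0.0]}, 0),
    ({"sensor": [0.0, 1.0], "schedule": [0.0], "behavior": [0.0, 1.0]}, 1),
]
trained = train(init_params(2, 1, 2, len(INTENTS)), data)
```

After training, `objective(trained, data)` reaches 1.0 on this separable toy set, mirroring the claimed loop of measuring the matching degree and adjusting the three parameter groups until predicted intents align with the recorded labels.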