US 11,986,958 B2
Skill templates for robotic demonstration learning
Bala Venkata Sai Ravi Krishna Kolluri, Fremont, CA (US); Stefan Schaal, Mountain View, CA (US); Benjamin M. Davis, Oakland, CA (US); Ralf Oliver Michael Schönherr, San Francisco, CA (US); and Ning Ye, Palo Alto, CA (US)
Assigned to Intrinsic Innovation LLC, Mountain View, CA (US)
Filed by Intrinsic Innovation LLC, Mountain View, CA (US)
Filed on May 21, 2020, as Appl. No. 16/880,862.
Prior Publication US 2021/0362331 A1, Nov. 25, 2021
Int. Cl. G06F 9/445 (2018.01); B25J 9/16 (2006.01); G06F 9/448 (2018.01); G06N 20/00 (2019.01)
CPC B25J 9/163 (2013.01) [B25J 9/161 (2013.01); G06F 9/4498 (2018.02); G06N 20/00 (2019.01)]
20 Claims
 
1. A method performed by one or more computers, the method comprising:
receiving a skill template for a task to be performed by a robot, wherein the skill template defines a state machine having a plurality of subtasks and one or more respective transition conditions between one or more of the subtasks,
wherein the skill template indicates which of the subtasks are demonstration subtasks to be refined using local demonstration data;
obtaining a base control policy for a demonstration subtask of the skill template;
receiving, for the demonstration subtask of the skill template, local demonstration data generated from a user demonstrating with the robot how to perform the demonstration subtask, wherein the local demonstration data comprises a set of task state representations generated from a plurality of sensor streams while the user demonstrates the demonstration subtask using the robot;
refining the base control policy for the demonstration subtask using the local demonstration data, wherein a refined base control policy for the demonstration subtask is configured to generate, using one or more input sensor streams, a command to be executed by the robot; and
executing the skill template on the robot, thereby causing the robot to transition through the state machine defined by the skill template to perform the task, including performing commands generated by the refined base control policy.
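
By way of non-limiting illustration, the steps recited above can be sketched in Python. Every name in the sketch (Subtask, SkillTemplate, refine, execute, and the "distance" and "force" sensor keys) is hypothetical and does not come from the specification. The refinement shown is a toy mean-residual offset standing in for whatever learning procedure an actual embodiment would use, and each command is reduced to a single number for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Observation = Dict[str, float]            # one value per sensor stream (assumed shape)
Policy = Callable[[Observation], float]   # maps an observation to a scalar command
Condition = Callable[[Observation], bool]


@dataclass
class Subtask:
    name: str
    is_demonstration: bool = False        # flagged for refinement with local demo data
    policy: Optional[Policy] = None


@dataclass
class SkillTemplate:
    subtasks: Dict[str, Subtask]
    start: str
    # Each transition: (source subtask, transition condition, destination subtask)
    transitions: List[Tuple[str, Condition, str]] = field(default_factory=list)

    def demonstration_subtasks(self) -> List[str]:
        return [n for n, s in self.subtasks.items() if s.is_demonstration]


def refine(base: Policy, demos: List[Tuple[Observation, float]]) -> Policy:
    """Toy refinement: shift the base policy by the mean demonstrated residual."""
    if not demos:
        return base
    offset = sum(cmd - base(obs) for obs, cmd in demos) / len(demos)
    return lambda obs: base(obs) + offset


def execute(template: SkillTemplate,
            observations: List[Observation]) -> List[Tuple[str, float]]:
    """Walk the state machine, recording (subtask, command) for each observation."""
    state = template.start
    log: List[Tuple[str, float]] = []
    for obs in observations:
        policy = template.subtasks[state].policy
        if policy is None:
            raise ValueError(f"subtask {state!r} has no control policy")
        log.append((state, policy(obs)))
        for src, cond, dst in template.transitions:
            if src == state and cond(obs):
                state = dst
                break
    return log


# Receive a skill template with one demonstration subtask ("insert").
template = SkillTemplate(
    subtasks={
        "approach": Subtask("approach", policy=lambda obs: 1.0),
        "insert": Subtask("insert", is_demonstration=True),
    },
    start="approach",
    transitions=[("approach", lambda obs: obs["distance"] < 0.1, "insert")],
)

# Obtain a base control policy and local demonstration data (both made up here).
base_policy: Policy = lambda obs: 0.5 * obs["force"]
demos = [
    ({"distance": 0.0, "force": 2.0}, 1.5),
    ({"distance": 0.0, "force": 4.0}, 2.5),
]

# Refine the base policy for each demonstration subtask, then execute the template.
for name in template.demonstration_subtasks():
    template.subtasks[name].policy = refine(base_policy, demos)

log = execute(template, [
    {"distance": 0.5, "force": 0.0},
    {"distance": 0.05, "force": 0.0},
    {"distance": 0.0, "force": 2.0},
])
print(log)
```

In this sketch the non-demonstration subtask keeps its fixed policy, while only the subtask flagged as a demonstration subtask receives a refined policy, mirroring the distinction the skill template draws between the two kinds of subtask.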