| CPC B25J 9/1664 (2013.01) | 17 Claims |

|
1. A system for object manipulation, comprising:
a planner performing:
object trajectory planning through object path planning and object trajectory optimization; and
grasp sequence planning by generating a grasp sequence by a first type of grasp sequence planning when an initial grasp and a final grasp are given and a second type of grasp sequence planning when no final grasp is given,
wherein the first type of grasp sequence planning is based on an optimized cost function for the planner,
wherein the second type of grasp sequence planning is based on a deep reinforcement learning (DRL) policy trained based on a reward function for the planner associated with the optimized cost function; and
a controller, executing the object trajectory and the grasp sequence via a robot appendage including a robot finger and one or more actuators,
wherein the optimized cost function includes a cost for reachability of contact points, a wrench error for realizing object trajectory, a contact force magnitude, a cost for transition between consecutive grasps, and a sliding cost term including a cost for sliding the robot finger based on a sliding distance, a change in a desired normal direction, and a wrench error associated with sliding the robot finger, and
wherein the reward function is based on the cost for reachability of contact points, the wrench error for realizing object trajectory, the contact force magnitude, a transition penalty for adding or sliding the robot finger already at a desired location, a penalty for removing or sliding a robot finger which is not yet in contact with an object, a penalty for the sliding distance, the change in desired normal direction, and the wrench error associated with sliding the robot finger.
|
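The relationship claim 1 draws between the two types of grasp sequence planning, the optimized cost function, and the reward function can be illustrated with a minimal sketch. The claim only lists which terms the cost and reward include; the weighted-sum form, the negative-cost form of the reward, and every name below (`COST_TERMS`, `REWARD_TERMS`, `grasp_cost`, `step_reward`, `plan_grasp_sequence`, and the `optimize` and `policy` callables) are assumptions made for illustration, not the claimed implementation.

```python
from typing import Callable, Mapping, Optional, Sequence

# Term names mirror the cost components listed in claim 1 (names are hypothetical).
COST_TERMS = (
    "reachability",          # cost for reachability of contact points
    "wrench_error",          # wrench error for realizing the object trajectory
    "contact_force",         # contact force magnitude
    "transition",            # cost for transition between consecutive grasps
    "sliding_distance",      # sliding cost: distance the finger slides
    "normal_change",         # sliding cost: change in desired normal direction
    "sliding_wrench_error",  # sliding cost: wrench error while sliding
)

# The reward replaces the single transition cost with two penalties.
REWARD_TERMS = (
    "reachability",
    "wrench_error",
    "contact_force",
    "redundant_add_or_slide",   # adding/sliding a finger already at the desired location
    "invalid_remove_or_slide",  # removing/sliding a finger not yet in contact
    "sliding_distance",
    "normal_change",
    "sliding_wrench_error",
)


def grasp_cost(terms: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of the cost terms (the weighted-sum form is an assumption)."""
    return sum(weights[name] * terms[name] for name in COST_TERMS)


def step_reward(terms: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Reward as the negative weighted sum of penalties (an assumed form)."""
    return -sum(weights[name] * terms[name] for name in REWARD_TERMS)


def plan_grasp_sequence(
    initial_grasp: object,
    final_grasp: Optional[object],
    optimize: Callable[[object, object], Sequence[object]],
    policy: Callable[[object], Sequence[object]],
) -> Sequence[object]:
    """Select the planning type based on whether a final grasp is given."""
    if final_grasp is not None:
        # First type: initial and final grasps are both given.
        return optimize(initial_grasp, final_grasp)
    # Second type: no final grasp, so roll out the trained DRL policy.
    return policy(initial_grasp)
```

In this sketch, `optimize` stands in for a search that minimizes `grasp_cost` over candidate grasp sequences, and `policy` stands in for a DRL policy trained against `step_reward`. Both are passed in as callables because the claim does not specify how either is realized.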