| CPC G05D 1/646 (2024.01) [B62D 57/032 (2013.01)] | 6 Claims |

|
1. A method for controlling motions of a quadruped robot based on reinforcement learning and position increment, comprising:
acquiring motion environment information, quadruped robot attitude information, and foot sole position information;
based on the acquired information, generating foot sole positions of the quadruped robot during motions within all preset time steps, and calculating a change of the foot sole positions in all the time steps;
taking a maximum moving distance within a single time step as a constraint, and accumulating the time steps at the same time to obtain a foot sole position trajectory;
controlling the quadruped robot to perform corresponding actions based on the foot sole position trajectory combined with a preset reward function, so as to keep motion balance of the quadruped robot;
acquiring and processing joint state historical information and leg phase information of the quadruped robot as a control input of the quadruped robot, to obtain a next action command to control the motions of the quadruped robot, wherein
a pressure sensor is not provided on each foot sole of the quadruped robot, the joint state historical information being used as an input of a reinforcement learning policy to detect ground contact of each foot sole of the quadruped robot, and
the joint state historical information includes a joint position error and a joint velocity, the joint position error being a deviation between a current joint position and a previous joint position instruction;
outputting, by an independent policies modulating trajectory generator (PMTG) for each leg, a foot sole position of each leg of the quadruped robot in a Z-axis direction; wherein the PMTG is defined to simulate a basic stepping gait mode by using a cubic Hermite spline, according to the following equation:
![]() where k = 2(ϕ − π)/π, h is a maximum allowable foot raising height, and ϕ ∈ [0, 2π) is a TG phase;
outputting a foot sole position increment and an adjusting frequency of each leg of the quadruped robot based on the reinforcement learning policy; and, a target foot sole position (x, y, z)_t at a time t is obtained by the following equation:
![]() where, foot sole positions in an X-axis direction and a Y-axis direction are obtained by accumulating foot sole position increments (Δx, Δy) in the X-axis and Y-axis directions output by the reinforcement learning policy; and, a foot sole position in the Z-axis direction is obtained by superposing a foot sole position increment Δz in the Z-axis direction output by the reinforcement learning policy and a priori value provided by the PMTG.
|
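The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions. The two equations of the claim are not reproduced in this text, so the piecewise polynomial in `hermite_swing_height` is one common cubic Hermite form that is consistent with the stated definition k = 2(ϕ − π)/π. Scaling an increment by its Euclidean norm is only one possible way to apply the maximum moving distance per time step. The sign convention of the joint position error (current position minus previous instruction) is also assumed. All function, class, and parameter names are hypothetical.

```python
import math
from collections import deque


def hermite_swing_height(phi, h):
    """Assumed Z-axis prior: cubic Hermite rise to h and fall back to 0."""
    k = 2.0 * (phi - math.pi) / math.pi  # definition of k taken from the claim
    if 0.0 <= k < 1.0:
        return h * (-2.0 * k**3 + 3.0 * k**2)  # foot rises from 0 to h
    if 1.0 <= k < 2.0:
        return h * (2.0 * k**3 - 9.0 * k**2 + 12.0 * k - 4.0)  # foot falls to 0
    return 0.0  # remaining half of the cycle: no prior lift


def clamp_increment(dx, dy, dz, max_step):
    """Assumed constraint: shrink the increment so its length is at most max_step."""
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm <= max_step or norm == 0.0:
        return dx, dy, dz
    scale = max_step / norm
    return dx * scale, dy * scale, dz * scale


def next_foot_target(prev_xy, increment, phi, h, max_step):
    """Accumulate (dx, dy) onto the previous X/Y; superpose dz on the Z prior."""
    dx, dy, dz = clamp_increment(*increment, max_step)
    x = prev_xy[0] + dx
    y = prev_xy[1] + dy
    z = hermite_swing_height(phi, h) + dz
    return x, y, z


class JointHistory:
    """Fixed-length buffer of (joint position error, joint velocity) pairs."""

    def __init__(self, length):
        self.buf = deque(maxlen=length)  # oldest entries drop out automatically

    def push(self, q, q_cmd_prev, qd):
        # Error = current joint position minus previous position instruction
        # (sign convention assumed here).
        err = [a - b for a, b in zip(q, q_cmd_prev)]
        self.buf.append((err, list(qd)))

    def flat(self):
        """Flatten the history into one list usable as a policy input."""
        out = []
        for err, vel in self.buf:
            out.extend(err)
            out.extend(vel)
        return out
```

In use, `next_foot_target` would be called once per leg per time step with the increment output by the policy, and `JointHistory.flat()` would be concatenated with the leg phase information to form the policy input. Because the history stands in for foot pressure sensors, its length would need to be long enough for contact events to show up in the joint position error.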