| CPC G07C 5/02 (2013.01) [B60W 50/06 (2013.01); G06N 7/01 (2023.01); B60W 2050/0018 (2013.01); B60W 60/001 (2020.02)] | 18 Claims |

|
1. A driving decision-making method, comprising:
obtaining, by an autonomous driving vehicle, information of a current driving environment state of the autonomous driving vehicle, wherein the autonomous driving vehicle includes one or more sensors;
constructing, by the autonomous driving vehicle, a Monte Carlo tree based on the current driving environment state, wherein the Monte Carlo tree comprises N nodes, each node represents a corresponding driving environment state, the N nodes comprise a root node and N−1 non-root nodes, the root node represents the current driving environment state, a first driving environment state represented by a first node is predicted by using a stochastic model of driving environments based on a second driving environment state represented by a parent node of the first node and based on a driving action, the driving action is determined by the parent node of the first node in a process of obtaining the first node through expansion, the first node is any node of the N−1 non-root nodes, and N is a positive integer greater than or equal to 2;
determining, by the autonomous driving vehicle, in the Monte Carlo tree based on at least one of an access count or a value function of each node in the Monte Carlo tree, a node sequence, wherein the node sequence comprises a plurality of nodes that starts from the root node and ends at a leaf node;
in response to determining the node sequence, determining, by the autonomous driving vehicle, a driving action sequence of a plurality of future driving steps, wherein each future driving step in the driving action sequence comprises a driving action corresponding to each node comprised in the node sequence, and wherein the driving action sequence is used by the autonomous driving vehicle for driving decision-making;
autonomously driving, by the autonomous driving vehicle, the autonomous driving vehicle based on a first driving action in the driving action sequence;
obtaining, by the autonomous driving vehicle, an actual driving environment state after the first driving action is executed; and
updating, by the autonomous driving vehicle, the stochastic model of driving environments based on the current driving environment state, the first driving action, and the actual driving environment state, wherein
the access count of each node is determined based on access counts of subnodes of the each node and an initial access count of the each node, the value function of the each node is determined based on value functions of subnodes of the each node and an initial value function of the each node, the initial access count of the each node is 1, and the initial value function of the each node is determined based on a value function that matches the corresponding driving environment state represented by the each node.
|
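The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The integer "speed level" state, the three-action set, the `TARGET` value function, the count-based `TransitionModel`, and the UCB-style rule used to choose which node to expand are all hypothetical choices made for illustration. The sketch also assumes that a node's value is its initial value plus the sum of its subnodes' values, which is only one reading of "determined based on value functions of subnodes ... and an initial value function". The root node carries no action here, so the driving action sequence is read from the non-root nodes of the node sequence.

```python
import math
import random
from collections import defaultdict

ACTIONS = ("brake", "keep", "accelerate")           # hypothetical action set
EFFECT = {"brake": -1, "keep": 0, "accelerate": 1}  # nominal effect on the toy state
TARGET = 3                                          # hypothetical desired speed level

def value_fn(state):
    """Assumed value function matched to a state: closer to TARGET is better."""
    return -abs(state - TARGET)

class TransitionModel:
    """Toy stochastic model: empirical counts of (state, action) -> next state."""
    def __init__(self, rng):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.rng = rng

    def predict(self, state, action):
        seen = self.counts.get((state, action))
        if not seen:                                # no data yet: use the nominal effect
            return state + EFFECT[action]
        states, weights = zip(*seen.items())
        return self.rng.choices(states, weights=weights)[0]

    def update(self, state, action, actual_next_state):
        self.counts[(state, action)][actual_next_state] += 1

class Node:
    def __init__(self, state, action=None, parent=None):
        self.state, self.action, self.parent = state, action, parent
        self.children = []
        self.initial_value = value_fn(state)        # initial value from the matching value function
        self.visits = 1                             # initial access count is 1
        self.total_value = self.initial_value

    def refresh(self):
        # Access count and value are recomputed from the subnodes plus the initial terms.
        self.visits = 1 + sum(c.visits for c in self.children)
        self.total_value = self.initial_value + sum(c.total_value for c in self.children)

    def mean_value(self):
        return self.total_value / self.visits

def ucb(parent, child, c=1.0):
    # Assumed exploration rule; the claim does not prescribe one.
    return child.mean_value() + c * math.sqrt(math.log(parent.visits) / child.visits)

def build_tree(root_state, model, n_nodes):
    root = Node(root_state)
    for _ in range(n_nodes - 1):
        node = root
        while len(node.children) == len(ACTIONS):   # fully expanded: descend
            node = max(node.children, key=lambda ch: ucb(node, ch))
        action = ACTIONS[len(node.children)]        # parent picks the next untried action
        child = Node(model.predict(node.state, action), action, node)
        node.children.append(child)
        while node is not None:                     # propagate counts and values to the root
            node.refresh()
            node = node.parent
    return root

def best_sequence(root):
    """Follow the most-accessed child (ties broken by mean value) down to a leaf."""
    seq = [root]
    while seq[-1].children:
        seq.append(max(seq[-1].children, key=lambda ch: (ch.visits, ch.mean_value())))
    return seq

# One decision cycle: plan, act on the first action, observe, update the model.
model = TransitionModel(random.Random(0))
state = 0
root = build_tree(state, model, n_nodes=40)
actions = [n.action for n in best_sequence(root)[1:]]
first_action = actions[0]
actual_state = state + EFFECT[first_action]         # stand-in for a sensor observation
model.update(state, first_action, actual_state)
```

With these update rules a node's access count equals the size of its subtree, so the root's count equals N. In a real system the observed state would come from the vehicle's sensors rather than from `EFFECT`, and the model would likely be a learned probabilistic predictor rather than a count table.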