US 11,657,268 B1
Training neural networks to assign scores
Khaled Refaat, Mountain View, CA (US); and Kai Ding, Santa Clara, CA (US)
Assigned to Waymo LLC, Mountain View, CA (US)
Filed by Waymo LLC, Mountain View, CA (US)
Filed on Sep. 27, 2019, as Appl. No. 16/586,257.
Int. Cl. G06N 3/08 (2023.01); G06N 3/04 (2023.01); G05D 1/02 (2020.01); G05D 1/00 (2006.01)
CPC G06N 3/08 (2013.01) [G06N 3/04 (2013.01); G05D 1/0088 (2013.01); G05D 1/0221 (2013.01); G05D 2201/0213 (2013.01)]
18 Claims
 
1. A method of training a neural network having a plurality of network parameters and configured to receive a network input and to process the network input in accordance with the network parameters to generate a network output that assigns a respective score to each of a plurality of locations in the network input, the method comprising:
obtaining a training input and a corresponding ground truth output that assigns a respective ground truth score for each of the plurality of locations in the training input, wherein the plurality of locations in the training input include corresponding locations of one or more agents that are in an environment within a vicinity of a vehicle, and wherein the respective ground truth scores include a respective importance score for each of the one or more agents;
processing the training input using the neural network and in accordance with current values of the network parameters to generate a training output that assigns a respective training score to each of the plurality of locations in the training input;
computing a loss for the training input, comprising:
selecting a plurality of candidate locations from the plurality of locations;
setting to zero the training scores for any location in the selected candidate locations that has a ground truth score below a threshold value;
for each of a plurality of pairs of locations in the selected candidate locations:
computing a pair-wise loss for the pair based on the ground truth scores and the training scores for the pair of locations; and
combining the pair-wise losses for the pairs of locations to compute the loss for the training input; and
determining an update to the current values of the parameters by determining a gradient of the loss with respect to the network parameters.
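
The loss computation recited in claim 1 can be illustrated with a minimal sketch in plain Python. This is not the patented implementation: the claim does not specify the form of the pair-wise loss, how pairs are chosen, or how the pair-wise losses are combined. The sketch assumes a logistic pair-wise ranking loss, uses every pair of candidates with unequal ground truth scores, and averages the results. The function name `pairwise_ranking_loss` and its parameters are hypothetical, and the candidate locations are passed in as indices rather than selected by any particular rule.

```python
import math
from itertools import combinations


def pairwise_ranking_loss(train_scores, gt_scores, candidates, threshold):
    """Illustrative loss for one training input (hypothetical helper).

    train_scores, gt_scores: per-location scores, indexed by location.
    candidates: indices of the selected candidate locations.
    threshold: ground truth scores below this zero the training score.
    """
    # Set to zero the training score of any candidate whose ground truth
    # score is below the threshold.
    masked = {
        i: 0.0 if gt_scores[i] < threshold else train_scores[i]
        for i in candidates
    }

    pair_losses = []
    for i, j in combinations(candidates, 2):
        if gt_scores[i] == gt_scores[j]:
            # Assumption: tied pairs carry no ranking signal and are skipped.
            continue
        # Order the pair so that `hi` has the larger ground truth score.
        hi, lo = (i, j) if gt_scores[i] > gt_scores[j] else (j, i)
        # Assumption: logistic pair-wise loss, small when the training
        # scores rank the pair the same way as the ground truth scores.
        margin = masked[hi] - masked[lo]
        pair_losses.append(math.log1p(math.exp(-margin)))

    # Assumption: combine the pair-wise losses by averaging.
    return sum(pair_losses) / len(pair_losses) if pair_losses else 0.0


# Example: four locations, three of them selected as candidates.
loss = pairwise_ranking_loss(
    train_scores=[2.0, 0.5, 1.5, 3.0],
    gt_scores=[0.9, 0.6, 0.1, 0.8],
    candidates=[0, 1, 2],
    threshold=0.3,
)
```

In a real training loop the same computation would be expressed in an automatic differentiation framework, so that the gradient of the loss with respect to the network parameters, as recited in the final step, could be obtained by backpropagation.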