CPC G06T 7/60 (2013.01) [B60W 40/04 (2013.01); B60W 50/0205 (2013.01); B60W 60/001 (2020.02); G06T 7/20 (2013.01); G06V 10/764 (2022.01); G06V 10/98 (2022.01); G06V 20/56 (2022.01); B60W 2554/20 (2020.02); B60W 2554/4026 (2020.02); B60W 2554/4029 (2020.02); B60W 2554/4044 (2020.02); G06T 2207/20081 (2013.01); G06T 2207/30252 (2013.01); G06V 2201/07 (2022.01)] | 20 Claims |
1. A system comprising:
one or more processors; and
one or more non-transitory computer-readable media storing instructions executable by the one or more processors, wherein the instructions, when executed, cause the system to perform operations comprising:
receiving sensor data representing an object in an environment;
determining, based at least in part on the sensor data, multi-channel input data representing the environment;
inputting the multi-channel input data into a machine-learned (ML) model;
determining, by the ML model, a candidate bounding box representing the object in the environment and a confidence value associated with the candidate bounding box;
receiving ground truth data associated with the multi-channel input data, the ground truth data including a ground truth bounding box associated with the object;
determining an intersection over union (IoU) between the candidate bounding box and the ground truth bounding box;
determining a first loss based at least in part on the IoU;
determining a second loss based at least in part on the confidence value associated with the candidate bounding box; and
training the ML model based at least in part on the first loss and the second loss.
|
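By way of a non-limiting illustration, the IoU and the two losses recited in claim 1 might be computed as in the following Python sketch. The claim does not specify a box parameterization or the form of either loss, so everything here is an assumption made for illustration: the axis-aligned `Box` type, the `1 - IoU` form of the first loss, the binary cross-entropy form of the second loss, and the hypothetical `threshold` and `confidence_weight` parameters.

```python
import math
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box (assumed representation; not specified by the claim)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    # Clamp each side at zero so an empty (inverted) box has zero area.
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(candidate: Box, ground_truth: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    overlap = Box(
        max(candidate.x_min, ground_truth.x_min),
        max(candidate.y_min, ground_truth.y_min),
        min(candidate.x_max, ground_truth.x_max),
        min(candidate.y_max, ground_truth.y_max),
    )
    intersection = area(overlap)
    union = area(candidate) + area(ground_truth) - intersection
    return intersection / union if union > 0.0 else 0.0


def iou_loss(iou_value: float) -> float:
    """First loss (assumed form): 1 - IoU, which is zero for a perfect match."""
    return 1.0 - iou_value


def confidence_loss(confidence: float, iou_value: float,
                    threshold: float = 0.5) -> float:
    """Second loss (assumed form): binary cross-entropy between the predicted
    confidence and a 0/1 target that is 1 when the IoU meets the threshold."""
    target = 1.0 if iou_value >= threshold else 0.0
    eps = 1e-7  # keep log() away from 0
    p = min(max(confidence, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def total_loss(candidate: Box, ground_truth: Box, confidence: float,
               confidence_weight: float = 1.0) -> float:
    """Scalar that a training step could minimize: first loss plus weighted second loss."""
    value = iou(candidate, ground_truth)
    return iou_loss(value) + confidence_weight * confidence_loss(confidence, value)
```

For example, `Box(0, 0, 2, 2)` and `Box(1, 1, 3, 3)` overlap in a unit square, giving an intersection of 1, a union of 7, and an IoU of 1/7. In a training loop, the value returned by `total_loss` would typically be handed to an optimizer to update the ML model's parameters; that step is omitted here because the claim does not name a framework.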