US 12,333,389 B2
Autonomous vehicle system for intelligent on-board selection of data for training a remote machine learning model
Shaojun Zhu, Pittsburgh, PA (US); Richard L. Kwant, San Bruno, CA (US); and Nicolas Cebron, Sunnyvale, CA (US)
Assigned to Volkswagen Group of America Investments, LLC, Reston, VA (US)
Filed by Volkswagen Group of America Investments, LLC, Reston, VA (US)
Filed on Dec. 16, 2020, as Appl. No. 17/124,413.
Prior Publication US 2022/0188695 A1, Jun. 16, 2022
Int. Cl. G06N 20/00 (2019.01)
CPC G06N 20/00 (2019.01)
22 Claims
 
1. A method for on-board selection of data logs for training a machine learning model, comprising, by an on-board computing device of an autonomous vehicle:
receiving, from a plurality of sensors, a plurality of sensor data logs corresponding to surroundings of the autonomous vehicle;
identifying one or more events within each of the plurality of sensor data logs;
identifying a property of a plurality of different properties of the machine learning model that is to be trained;
identifying a first event of the one or more events which is missing or under-represented in training data for training the identified property of the machine learning model;
for each sensor data log of the plurality of sensor data logs associated with the identified first event:
analyzing features of the identified one or more events within that sensor data log for determining whether that sensor data log satisfies one or more usefulness criteria for training the identified property of the machine learning model by comparing a first event identified using a first sensor data log collected by a first sensor of the plurality of sensors and a corresponding event identified using a second sensor data log collected by a second sensor of the plurality of sensors, wherein the features comprise at least spatial features,
analyzing whether a difference between the first event and the corresponding event is more than a threshold,
determining that the first sensor data log and the second sensor data log are spatially inconsistent if the difference between the first event and the corresponding event is more than the threshold,
determining that the first sensor data log or the second sensor data log satisfies the one or more usefulness criteria for training the identified property of the machine learning model if an actual accuracy of the machine learning model will improve upon training using spatially inconsistent first sensor data log and second sensor data log, and
in response to determining that that sensor data log satisfies one or more usefulness criteria for training the machine learning model, transmitting that sensor data log to a remote computing device for training the machine learning model;
performing operations by a control system to control, using the machine learning model trained using the transmitted sensor data log, movement of the autonomous vehicle along a navigation route;
receiving information relating to an effectiveness of the sensor data log in training the machine learning model; and
modifying, based on the received information, a weight or frequency of use associated with at least one data log selection strategy of a plurality of different data log selection strategies that can be used during said analyzing features.
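The cross-sensor comparison and the feedback step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name here (`Event`, `spatial_difference`, `select_logs`, `update_strategy_weights`), the use of 2-D Euclidean distance as the "difference", the `would_improve_accuracy` predicate, and the additive weight update centered on 0.5 are assumptions introduced only for illustration; the claim does not specify any of them.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical event detected in one sensor data log."""
    log_id: str   # identifier of the sensor data log the event came from
    label: str    # e.g. "pedestrian"
    x: float      # assumed position in a shared vehicle frame, meters
    y: float


def spatial_difference(first: Event, corresponding: Event) -> float:
    # Assumption: the "difference" is the Euclidean distance between positions.
    return math.hypot(first.x - corresponding.x, first.y - corresponding.y)


def is_spatially_inconsistent(first: Event, corresponding: Event,
                              threshold: float) -> bool:
    # Inconsistent only if the difference is MORE than the threshold.
    return spatial_difference(first, corresponding) > threshold


def select_logs(pairs: List[Tuple[Event, Event]], threshold: float,
                would_improve_accuracy: Callable[[Event, Event], bool]) -> List[str]:
    """Return ids of logs that meet the (assumed) usefulness criterion."""
    selected: List[str] = []
    for first, corresponding in pairs:
        if (is_spatially_inconsistent(first, corresponding, threshold)
                and would_improve_accuracy(first, corresponding)):
            # In this sketch both logs of an inconsistent pair are kept.
            selected.extend([first.log_id, corresponding.log_id])
    return selected


def update_strategy_weights(weights: Dict[str, float], strategy: str,
                            effectiveness: float,
                            learning_rate: float = 0.1) -> Dict[str, float]:
    """Shift one selection strategy's weight using effectiveness feedback.

    Assumption: effectiveness is in [0, 1]; values above 0.5 raise the
    weight and values below 0.5 lower it. Weights are then renormalized.
    """
    updated = dict(weights)
    updated[strategy] = max(
        0.0, updated[strategy] + learning_rate * (effectiveness - 0.5))
    total = sum(updated.values())
    if total == 0.0:
        return dict(weights)  # nothing sensible to normalize; keep old weights
    return {name: value / total for name, value in updated.items()}


if __name__ == "__main__":
    lidar = Event("lidar_log_1", "pedestrian", 10.0, 5.0)
    camera = Event("camera_log_1", "pedestrian", 13.0, 9.0)
    chosen = select_logs([(lidar, camera)], threshold=2.0,
                         would_improve_accuracy=lambda a, b: True)
    print(chosen)
    print(update_strategy_weights({"spatial": 0.5, "temporal": 0.5},
                                  "spatial", effectiveness=1.0))
```

In this sketch the selected log ids stand in for the logs that would be transmitted to the remote computing device, and the returned weights stand in for the modified "weight or frequency of use" of a data log selection strategy.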