US 11,654,915 B2
Method of generating vehicle control data, vehicle control device, and vehicle control system
Yohsuke Hashimoto, Nagakute (JP); Akihiro Katayama, Toyota (JP); Yuta Oshiro, Nagoya (JP); Kazuki Sugie, Toyota (JP); and Naoya Oka, Nagakute (JP)
Assigned to TOYOTA JIDOSHA KABUSHIKI KAISHA, Toyota (JP)
Filed by TOYOTA JIDOSHA KABUSHIKI KAISHA, Toyota (JP)
Filed on Oct. 8, 2020, as Appl. No. 16/948,973.
Claims priority of application No. JP2019-191093 (JP), filed on Oct. 18, 2019.
Prior Publication US 2021/0114596 A1, Apr. 22, 2021
Int. Cl. B60W 30/182 (2020.01); B60W 10/06 (2006.01); G06F 18/21 (2023.01)
CPC B60W 30/182 (2013.01) [B60W 10/06 (2013.01); G06F 18/217 (2023.01); B60W 2510/0604 (2013.01); B60W 2520/105 (2013.01)]
13 Claims
 
1. A method of generating vehicle control data that is applied to a vehicle configured to select one of a plurality of traveling control modes and is executed by a processor in a state in which relationship definition data defining a relationship between a state of the vehicle and an action variable as a variable relating to an operation of electronic equipment in the vehicle is stored in a memory, the method comprising:
operation processing for operating the electronic equipment;
acquisition processing for acquiring a detection value of a sensor configured to detect the state of the vehicle;
reward calculation processing for providing, based on the detection value acquired through the acquisition processing, a greater reward when a characteristic of the vehicle having correlation with the traveling control modes satisfies a criterion than when the characteristic of the vehicle does not satisfy the criterion; and
update processing for updating the relationship definition data with the state of the vehicle, a value of the action variable used for the operation of the electronic equipment, and the reward given according to the operation based on the detection value acquired through the acquisition processing as inputs to update mapping determined in advance, wherein:
the processor is configured to, based on the update mapping, output the relationship definition data updated to increase an expected return on the reward when the electronic equipment is operated in compliance with the relationship definition data;
the reward calculation processing includes processing for providing a reward such that the reward provided when the selected traveling control mode is a first traveling control mode is different from the reward provided when the selected traveling control mode is a second traveling control mode even though the characteristic of the vehicle satisfies the same criterion, the first traveling control mode being different from the second traveling control mode;
a change in accelerator operation amount is included in the state of the vehicle;
the reward calculation processing includes processing for providing a greater reward when a front-rear direction acceleration of the vehicle accompanied by the change in accelerator operation amount satisfies a criterion than when the acceleration does not satisfy the criterion, and providing different rewards between the first traveling control mode and the second traveling control mode among the traveling control modes even though the acceleration satisfies the same criterion;
the vehicle includes an internal combustion engine as a thrust generation device of the vehicle;
a throttle valve of the internal combustion engine is included in the electronic equipment; and
a variable relating to an opening degree of the throttle valve is included in the action variable.
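
The processing recited in claim 1 can be illustrated with a minimal sketch. It assumes tabular Q-learning as the update mapping and a Q-table as the relationship definition data; the claim itself does not name a specific learning algorithm. The mode names, acceleration band, reward values, discretized throttle openings, and all function names below are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical per-mode reward settings. Both modes share the same criterion
# (an acceleration band in m/s^2) but grant different rewards for meeting it.
MODE_REWARDS = {
    "sport":  {"band": (1.0, 4.0), "hit": 10.0, "miss": -10.0},
    "normal": {"band": (1.0, 4.0), "hit": 5.0,  "miss": -5.0},
}

# Action variable: assumed discretized throttle opening degrees (0 = closed, 1 = full).
THROTTLE_ACTIONS = [0.0, 0.25, 0.5, 0.75, 1.0]


def calc_reward(mode, accel):
    """Reward calculation: greater reward when the front-rear acceleration
    satisfies the criterion, with the amount depending on the selected mode."""
    cfg = MODE_REWARDS[mode]
    lo, hi = cfg["band"]
    return cfg["hit"] if lo <= accel <= hi else cfg["miss"]


def select_action(q, state, epsilon, rng):
    """Operation processing: pick a throttle opening, epsilon-greedy over Q."""
    if rng.random() < epsilon:
        return rng.choice(THROTTLE_ACTIONS)
    return max(THROTTLE_ACTIONS, key=lambda a: q[(state, a)])


def update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Update processing: one Q-learning step (assumed update mapping) that
    moves Q toward a higher expected return."""
    best_next = max(q[(next_state, a)] for a in THROTTLE_ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


if __name__ == "__main__":
    rng = random.Random(0)
    q = defaultdict(float)            # relationship definition data (assumed Q-table)
    state = ("sport", 2)              # (selected mode, bucketed accelerator change)
    action = select_action(q, state, epsilon=0.1, rng=rng)
    accel = 2.5                       # stand-in for a sensor detection value
    reward = calc_reward("sport", accel)
    update(q, state, action, reward, next_state=("sport", 0))
    print(action, reward, q[(state, action)])
```

In this sketch the same acceleration value earns a different reward in each mode, so the table entries learned for one mode's states diverge from those learned for the other, which is the effect the mode-dependent reward limitation is directed at.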