US 12,008,743 B2
Hazard detection ensemble architecture system and method
Carlos Cunha, Mountain View, CA (US); Simon Markus Geisler, Bavaria (DE); and Ravi Kumar Satzoda, Sunnyvale, CA (US)
Assigned to Robert Bosch GmbH, Stuttgart (DE)
Filed by Robert Bosch GmbH, Stuttgart (DE)
Filed on May 22, 2020, as Appl. No. 16/881,581.
Prior Publication US 2021/0366096 A1, Nov. 25, 2021
Int. Cl. G06V 20/58 (2022.01); G06N 3/0455 (2023.01); G06N 3/08 (2023.01); G06N 20/20 (2019.01); G06T 7/00 (2017.01)
CPC G06T 7/0002 (2013.01) [G06N 3/08 (2013.01); G06N 20/20 (2019.01); G06T 2207/10012 (2013.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]
9 Claims
 
1. A method comprising:
receiving a stereo image having a red channel, a blue channel, and a green channel, wherein the red channel, the blue channel, and the green channel produce a 3-channel RGB image;
receiving a depth image;
concatenating the depth image to the red, green, and blue channels of the 3-channel RGB image to produce a 4-channel RGBD image, and providing the 4-channel RGBD image as a single input into a same layer of a neural network, wherein the neural network includes:
a first plurality of layers, wherein each layer in the first plurality of layers includes:
a convolutional layer;
a batch normal layer;
a leaky ReLU activation function; and
a max pooling layer;
a second plurality of layers, wherein each layer in the second plurality of layers includes:
a convolutional layer;
a batch normal layer; and
a leaky ReLU activation function, and wherein the second plurality of layers does not include a max pooling layer;
a third plurality of layers, wherein each layer in the third plurality of layers includes:
a convolutional layer;
a batch normal layer;
a leaky ReLU activation function; and
a you-only-look-once (“YOLO”) layer; and
one or more skip architectures;
determining one or more hazards within the stereo image using the neural network;
concatenating an output of the neural network with an output of an auxiliary semantic segmentation decoder to force the neural network to learn one or more features for a drivable space to determine the one or more hazards within the stereo image; and
driving a vehicle through the drivable space while avoiding the one or more hazards.
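
The input-preparation step recited in claim 1 (concatenating the depth image onto the red, green, and blue channels to form a single 4-channel RGBD input) can be illustrated with a minimal sketch. This is not the patentee's implementation. The function name make_rgbd, the channel-last (H, W, C) layout, the float32 cast, and the string labels for the layer types are all assumptions introduced here for illustration; the claim specifies only which layer types each group contains, not their counts, sizes, or ordering in code.

```python
import numpy as np

# Layer types recited for each group in claim 1. The string labels are
# hypothetical; the claim gives no layer counts or hyperparameters.
FIRST_GROUP_BLOCK = ("conv", "batch_norm", "leaky_relu", "max_pool")
SECOND_GROUP_BLOCK = ("conv", "batch_norm", "leaky_relu")  # no max pooling
THIRD_GROUP_BLOCK = ("conv", "batch_norm", "leaky_relu", "yolo")


def make_rgbd(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Append a depth map to an RGB image as a fourth channel.

    Assumes channel-last layout: rgb is (H, W, 3), depth is (H, W).
    """
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if depth.shape != rgb.shape[:2]:
        raise ValueError("depth must have shape (H, W) matching rgb")
    # Cast both inputs to float32 so all four channels share one dtype
    # (an assumption; the claim does not specify a data type).
    rgb_f = rgb.astype(np.float32)
    depth_f = depth.astype(np.float32)[..., np.newaxis]
    # Concatenate along the channel axis to get a single (H, W, 4) array.
    return np.concatenate([rgb_f, depth_f], axis=2)


rgb = np.zeros((4, 6, 3), dtype=np.uint8)
depth = np.ones((4, 6), dtype=np.float32)
rgbd = make_rgbd(rgb, depth)
print(rgbd.shape)  # (4, 6, 4)
```

The resulting array is the kind of single 4-channel input the claim describes as being provided to one layer of the network, as opposed to feeding the RGB image and the depth image through separate branches.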