US 12,008,743 B2
Hazard detection ensemble architecture system and method
Carlos Cunha, Mountain View, CA (US); Simon Markus Geisler, Bavaria (DE); and Ravi Kumar Satzoda, Sunnyvale, CA (US)
Assigned to Robert Bosch GmbH, Stuttgart (DE)
Filed by Robert Bosch GmbH, Stuttgart (DE)
Filed on May 22, 2020, as Appl. No. 16/881,581.
Prior Publication US 2021/0366096 A1, Nov. 25, 2021
Int. Cl. G06V 20/58 (2022.01); G06N 3/0455 (2023.01); G06N 3/08 (2023.01); G06N 20/20 (2019.01); G06T 7/00 (2017.01)

CPC G06T 7/0002 (2013.01) [G06N 3/08 (2013.01); G06N 20/20 (2019.01); G06T 2207/10012 (2013.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]

9 Claims

1. A method comprising:

receiving a stereo image having a red channel, a blue channel, and a green channel, wherein the red channel, the blue channel, and the green channel produce a 3-channel RGB image;

receiving a depth image;

concatenating the depth image to the red, green, and blue channels of the 3-channel RGB image to produce a 4-channel RGBD image, and providing the 4-channel RGBD image as a single input into a same layer of a neural network, wherein the neural network includes:

a first plurality of layers, wherein each layer in the first plurality of layers includes:

a convolutional layer;

a batch normal layer;

a leaky ReLU activation function; and

a max pooling layer;

a second plurality of layers, wherein each layer in the second plurality of layers includes:

a convolutional layer;

a batch normal layer; and

a leaky ReLU activation function, and wherein the second plurality of layers does not include a max pooling layer;

a third plurality of layers, wherein each layer in the third plurality of layers includes:

a convolutional layer;

a batch normal layer;

a leaky ReLU activation function; and

a you-only-look-once (“YOLO”) layer; and

one or more skip architectures;

determining one or more hazards within the stereo image using the neural network;

concatenating an output of the neural network with an output of an auxiliary semantic segmentation decoder to force the neural network to learn one or more features for a drivable space to determine the one or more hazards within the stereo image; and

driving a vehicle through the drivable space while avoiding the one or more hazards.
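
Illustrative sketch (not part of the claims): the channel concatenation and the three layer groups recited in claim 1 can be pictured with a few lines of plain Python. This is a minimal sketch under stated assumptions, not the patented implementation. The names concat_rgbd, build_layer_plan, FIRST_GROUP, SECOND_GROUP, and THIRD_GROUP are hypothetical. The layer counts are placeholders, since the claim recites only a "plurality" of layers per group (read here as at least two). Images are modeled as nested lists rather than tensors, and the skip architectures and the auxiliary semantic segmentation decoder are not modeled.

```python
from typing import List, Sequence, Tuple

# Operations in each layer of the three groups, in the order the claim lists them.
FIRST_GROUP: Tuple[str, ...] = ("conv", "batch_norm", "leaky_relu", "max_pool")
SECOND_GROUP: Tuple[str, ...] = ("conv", "batch_norm", "leaky_relu")  # no max pooling
THIRD_GROUP: Tuple[str, ...] = ("conv", "batch_norm", "leaky_relu", "yolo")


def concat_rgbd(
    rgb: Sequence[Sequence[Sequence[float]]],
    depth: Sequence[Sequence[float]],
) -> List[List[List[float]]]:
    """Append the depth value to each RGB pixel, giving an H x W x 4 image."""
    if len(rgb) != len(depth):
        raise ValueError("RGB and depth images must have the same height")
    rgbd = []
    for rgb_row, depth_row in zip(rgb, depth):
        if len(rgb_row) != len(depth_row):
            raise ValueError("RGB and depth images must have the same width")
        # Each output pixel is [R, G, B, D].
        rgbd.append([list(px) + [d] for px, d in zip(rgb_row, depth_row)])
    return rgbd


def build_layer_plan(n_first: int, n_second: int, n_third: int) -> List[Tuple[str, ...]]:
    """Return the per-layer operation lists for the three groups, in order.

    Assumption: "plurality" is read as at least two layers per group.
    """
    if min(n_first, n_second, n_third) < 2:
        raise ValueError("each group needs at least two layers")
    return [FIRST_GROUP] * n_first + [SECOND_GROUP] * n_second + [THIRD_GROUP] * n_third


if __name__ == "__main__":
    # A 1 x 2 toy image: two RGB pixels and two depth values (made-up numbers).
    rgbd = concat_rgbd([[(1, 2, 3), (4, 5, 6)]], [[0.5, 0.7]])
    print(rgbd)
    # Hypothetical layer counts, chosen only for illustration.
    for ops in build_layer_plan(2, 2, 2):
        print(" -> ".join(ops))
```

In this sketch the 4-channel result is a single structure, mirroring the claim's requirement that the RGBD image enter the network as one input to one layer rather than as separate RGB and depth branches.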