CPC G06T 7/0002 (2013.01) [G06N 3/08 (2013.01); G06N 20/20 (2019.01); G06T 2207/10012 (2013.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] | 9 Claims |
1. A method comprising:
receiving a stereo image having a red channel, a blue channel, and a green channel, wherein the red channel, the blue channel, and the green channel produce a 3-channel RGB image;
receiving a depth image;
concatenating the depth image to the red, green, and blue channels of the 3-channel RGB image to produce a 4-channel RGBD image, and providing the 4-channel RGBD image as a single input into a same layer of a neural network, wherein the neural network includes:
a first plurality of layers, wherein each layer in the first plurality of layers includes:
a convolutional layer;
a batch normal layer;
a leaky ReLU activation function; and
a max pooling layer;
a second plurality of layers, wherein each layer in the second plurality of layers includes:
a convolutional layer;
a batch normal layer; and
a leaky ReLU activation function, and wherein the second plurality of layers does not include a max pooling layer;
a third plurality of layers, wherein each layer in the third plurality of layers includes:
a convolutional layer;
a batch normal layer;
a leaky ReLU activation function; and
a you-only-look-once (“YOLO”) layer; and
one or more skip architectures;
determining one or more hazards within the stereo image using the neural network;
concatenating an output of the neural network with an output of an auxiliary semantic segmentation decoder to force the neural network to learn one or more features for a drivable space to determine the one or more hazards within the stereo image; and
driving a vehicle through the drivable space while avoiding the one or more hazards.
|
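The concatenating step of claim 1 and two of the recited per-layer operations can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes NumPy, a channels-last (H, W, C) array layout, a depth image already registered to the RGB image at the same resolution, a leaky ReLU negative slope of 0.1, and a 2x2 pooling window with stride 2. None of these details are specified in the claim. The function names `make_rgbd`, `leaky_relu`, and `max_pool_2x2` are hypothetical.

```python
import numpy as np


def make_rgbd(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Append an (H, W) depth image to an (H, W, 3) RGB image as a 4th channel."""
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if depth.shape != rgb.shape[:2]:
        raise ValueError("depth must have shape (H, W) matching rgb")
    # Assumption: both inputs are cast to float32 before concatenation.
    depth_channel = depth.astype(np.float32)[..., None]  # (H, W, 1)
    return np.concatenate([rgb.astype(np.float32), depth_channel], axis=2)


def leaky_relu(x: np.ndarray, negative_slope: float = 0.1) -> np.ndarray:
    """Leaky ReLU; the slope value is an assumption, not taken from the claim."""
    return np.where(x >= 0, x, negative_slope * x)


def max_pool_2x2(x: np.ndarray) -> np.ndarray:
    """2x2 max pooling with stride 2 over an (H, W, C) array (assumed window)."""
    h, w, c = x.shape
    x = x[: h - h % 2, : w - w % 2, :]  # drop an odd trailing row or column
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))


# Usage: a single 4-channel array is what would be fed to the first layer.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
depth = np.ones((4, 6), dtype=np.float32)
rgbd = make_rgbd(rgb, depth)          # shape (4, 6, 4)
pooled = max_pool_2x2(leaky_relu(rgbd))  # shape (2, 3, 4)
```

The point of the sketch is that the depth image joins the red, green, and blue channels in one array before any layer is applied, so the network receives a single 4-channel input rather than separate RGB and depth branches.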