| CPC G06T 7/20 (2013.01) [B60W 60/0027 (2020.02); G06V 10/806 (2022.01); H04N 19/43 (2014.11); G06T 2207/10028 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30252 (2013.01)] | 20 Claims |

|
1. A point-cloud motion sensor for estimating motion information of at least some points of an environment, the point-cloud motion sensor comprising:
a depth sensor configured to sense a dynamic environment to collect a temporal sequence of three-dimensional (3D) point clouds of the environment including a current 3D point cloud and a previous 3D point cloud;
a memory configured to store computer-executable instructions; and
a processor configured to iteratively process the sequence of 3D point clouds with a neural network, the neural network including:
an encoder providing a spatiotemporal encoding of each point in each of the 3D point clouds; and
a decoder decoding the spatiotemporal encodings to generate motion information for each point of each of the 3D point clouds,
wherein, to encode a current point of the current 3D point cloud, the encoder is configured to:
extract features of neighboring points in the current 3D point cloud located in proximity to a location of the current point to produce a current spatial encoding of the current point in a current frame;
extract features of neighboring points in the previous 3D point cloud located in proximity to a location in the previous 3D point cloud corresponding to the location of the current point to produce a previous spatial encoding of the current point in the previous frame; and
combine the current spatial encoding and the previous spatial encoding to produce a spatiotemporal encoding of the current point; and
wherein the neural network includes a contractive branch that sequentially downsamples its input and an expansive branch that sequentially upsamples its input, wherein the contractive branch includes one or multiple pairs of the encoder and a downsampling layer, and wherein the expansive branch includes one or multiple pairs of the decoder and an upsampling layer.
|
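
The three encoder steps recited in claim 1 (a current spatial encoding, a previous spatial encoding, and their combination) can be illustrated with a minimal sketch. The claim does not fix how neighbors are selected, how features are extracted, or how the two encodings are combined, so everything below is an assumption made for illustration: neighbors are the k nearest points, feature extraction is a per-channel maximum over the neighbors' features, the corresponding location in the previous point cloud is taken to be the same coordinates as the current point, and the combination is a concatenation. The function names `knn`, `spatial_encoding`, and `spatiotemporal_encoding` are hypothetical, and a trained network would use learned layers in place of the fixed pooling.

```python
import math


def knn(points, query, k):
    """Return indices of the k points closest to `query` (Euclidean distance)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]


def spatial_encoding(points, feats, query, k):
    """Pool the features of the k neighbors of `query` (assumed: channel-wise max)."""
    idx = knn(points, query, k)
    dim = len(feats[0])
    return [max(feats[i][d] for i in idx) for d in range(dim)]


def spatiotemporal_encoding(cur_pts, cur_feats, prev_pts, prev_feats, i, k=2):
    """Encode point i of the current cloud from both frames.

    Assumption: the location in the previous cloud that corresponds to the
    current point is the same (x, y, z) coordinate.
    """
    query = cur_pts[i]
    cur_enc = spatial_encoding(cur_pts, cur_feats, query, k)     # current frame
    prev_enc = spatial_encoding(prev_pts, prev_feats, query, k)  # previous frame
    return cur_enc + prev_enc  # assumed combination: concatenation


# Toy data: three points per frame, two feature channels per point.
cur_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
cur_feats = [[1.0, 0.0], [0.5, 2.0], [9.0, 9.0]]
prev_pts = [(0.1, 0.0, 0.0), (4.0, 4.0, 4.0), (0.0, 0.2, 0.0)]
prev_feats = [[3.0, 1.0], [7.0, 7.0], [2.0, 4.0]]

enc = spatiotemporal_encoding(cur_pts, cur_feats, prev_pts, prev_feats, i=0, k=2)
print(enc)  # [1.0, 2.0, 3.0, 4.0]
```

In this toy run, the first two channels come from the two nearest neighbors in the current frame and the last two from the two nearest neighbors in the previous frame. The distant points (5, 5, 5) and (4, 4, 4) contribute nothing, which reflects the "located in proximity" limitation of the claim.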