CPC G06T 17/05 (2013.01) [G06N 3/04 (2013.01); G06N 3/08 (2013.01); G06T 5/002 (2013.01); G06T 5/003 (2013.01); G06T 7/55 (2017.01); G06T 2200/08 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30252 (2013.01)] | 20 Claims |
1. A method for three-dimensional (3D) scene reconstruction by an agent, comprising:
estimating an ego-motion of the agent based on a current image from a sequence of images and a previous image from the sequence of images, each image in the sequence of images being a two-dimensional (2D) image;
estimating a per-pixel depth of the current image via a depth estimation model, the depth estimation model including a plurality of encoder layers and a plurality of decoder layers;
generating a 3D reconstruction of the current image based on the estimated ego-motion and the estimated per-pixel depth; and
controlling an action of the agent based on the 3D reconstruction.
|
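The generating step of claim 1 combines the per-pixel depth with the ego-motion estimate. A minimal sketch of one common way to do this is given below, assuming a pinhole camera model and an ego-motion expressed as a rotation matrix and translation vector. The claim does not specify any of this: the function name `backproject` and the parameters `fx`, `fy`, `cx`, `cy`, `rotation`, and `translation` are hypothetical and chosen for illustration only. The depth estimation model and the ego-motion estimator are not modeled here; their outputs are taken as given inputs.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> List[Point]:
    """Lift every pixel of a depth map to a 3D point in a common frame.

    Assumptions (not taken from the claim): pinhole intrinsics
    (fx, fy, cx, cy) and a pose given as a 3x3 rotation and a
    3-vector translation.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, d in enumerate(row):        # u: pixel column index, d: depth
            # Pinhole back-projection into the camera frame.
            cam = ((u - cx) * d / fx, (v - cy) * d / fy, d)
            # Apply the pose: p_world = R * p_cam + t.
            world = tuple(
                sum(rotation[i][k] * cam[k] for k in range(3)) + translation[i]
                for i in range(3)
            )
            points.append(world)
    return points


# Toy inputs: a 2x2 depth map, identity rotation, unit shift along x.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cloud = backproject(
    depth=[[2.0, 2.0], [2.0, 2.0]],
    fx=2.0, fy=2.0, cx=0.0, cy=0.0,
    rotation=identity,
    translation=[1.0, 0.0, 0.0],
)
```

In this toy call, each of the four pixels yields one 3D point, so `cloud` is a small point cloud expressed in the frame defined by the assumed pose. A practical system would typically vectorize this loop and accumulate points across the sequence of images.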