| CPC G06T 11/001 (2013.01) [G06F 17/18 (2013.01); G06F 18/22 (2023.01); G06F 18/253 (2023.01); G06T 5/50 (2013.01); G06V 10/74 (2022.01); G06V 10/806 (2022.01); G06V 20/647 (2022.01); G06V 40/103 (2022.01); G06T 2207/20084 (2013.01)] | 20 Claims |

|
1. A method comprising:
receiving a two-dimensional (2D) input image depicting a person in a first pose;
receiving a target image corresponding to a second pose;
processing the 2D input image to generate soft feature pooling features, the soft feature pooling features comprising a matrix of a collection of features extracted from the 2D input image that are arranged based on expected locations of different joints;
processing the target image to generate target soft intrinsic distances, the target soft intrinsic distances representing a likelihood that pixel values in the 2D input image correspond to a particular joint in the target image; and
decoding features of the 2D input image based on a combination of the soft feature pooling features and target soft intrinsic distances to generate a decoded image that depicts the person in the second pose.
|
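The data flow recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation, and every name in it (`soft_feature_pooling`, `soft_distances`, `combine_for_decoding`, `temperature`) is hypothetical. It assumes that joint coordinates are already known for both poses, that a softmax over negative squared Euclidean distance is an acceptable stand-in for the claimed soft intrinsic distances, and that the learned feature extractor and decoder networks are out of scope. The result is therefore a per-pixel feature map that a decoder would consume, not a decoded image.

```python
import numpy as np


def soft_distances(joint_xy, height, width, temperature=1.0):
    """Per-pixel soft assignment to joints, shape (J, H, W).

    ASSUMPTION: Euclidean distance stands in for the intrinsic distance
    named in the claim. joint_xy has shape (J, 2), ordered as (x, y).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dx = xs[None] - joint_xy[:, 0, None, None]
    dy = ys[None] - joint_xy[:, 1, None, None]
    logits = -(dx ** 2 + dy ** 2) / temperature
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=0, keepdims=True)  # sums to 1 over joints


def soft_feature_pooling(features, joint_weights, eps=1e-8):
    """Pool a (C, H, W) feature map into a (J, C) matrix, one row per joint.

    Each row is the weighted average of the features under that joint's
    soft location map, so the rows are arranged by joint.
    """
    channels = features.shape[0]
    joints = joint_weights.shape[0]
    flat_features = features.reshape(channels, -1)      # (C, H*W)
    flat_weights = joint_weights.reshape(joints, -1)    # (J, H*W)
    flat_weights = flat_weights / (flat_weights.sum(axis=1, keepdims=True) + eps)
    return flat_weights @ flat_features.T               # (J, C)


def combine_for_decoding(pooled, target_weights):
    """Spread pooled joint features over the target pose, shape (C, H, W).

    Each target pixel receives a mixture of the per-joint feature rows,
    weighted by that pixel's soft distance to each target joint.
    """
    return np.einsum("jc,jhw->chw", pooled, target_weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(16, 32, 32))             # placeholder encoder output
    source_joints = np.array([[8.0, 8.0], [16.0, 20.0], [24.0, 10.0]])
    target_joints = np.array([[10.0, 6.0], [16.0, 22.0], [20.0, 14.0]])

    source_weights = soft_distances(source_joints, 32, 32, temperature=20.0)
    target_weights = soft_distances(target_joints, 32, 32, temperature=20.0)

    pooled = soft_feature_pooling(features, source_weights)
    decoder_input = combine_for_decoding(pooled, target_weights)
    print(pooled.shape, decoder_input.shape)
```

In this sketch the pooled matrix carries appearance information from the first pose, indexed by joint, while the target weights say where each joint's features should land in the second pose. A trained decoder would then map the combined feature map to pixels.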