US 11,741,668 B2
Template-based generation of 3D object meshes from 2D images
Ramsey Jones, Seattle, WA (US); Jared Lafer, Los Angeles, CA (US); and Rebecca Kantar, Truro, MA (US)
Assigned to Roblox Corporation, San Mateo, CA (US)
Filed by Roblox Corporation, San Mateo, CA (US)
Filed on Apr. 23, 2021, as Appl. No. 17/239,380.
Claims priority of provisional application 63/015,391, filed on Apr. 24, 2020.
Prior Publication US 2021/0335039 A1, Oct. 28, 2021
Int. Cl. G06T 17/20 (2006.01); G06T 13/20 (2011.01); G06N 20/00 (2019.01)
CPC G06T 17/20 (2013.01) [G06N 20/00 (2019.01); G06T 13/20 (2013.01)] 17 Claims
OG exemplary drawing
 
1. A computer-implemented method to generate a 3D mesh for an object using a trained machine learning model, the method comprising:
providing a two-dimensional (2D) image of the object as input to the trained machine learning model, wherein the object is associated with a particular category;
determining the particular category of the object based on the 2D image, wherein determining the particular category of the object based on the 2D image comprises:
determining particular portions of the object by performing semantic segmentation of the 2D image based on a set of predetermined descriptors, wherein a number of output classes for the semantic segmentation matches a number of predetermined descriptors in the set; and
determining the particular category of the object based on the determined particular portions of the object;
obtaining a template three-dimensional (3D) mesh associated with the particular category;
generating, using the trained machine learning model and based on the 2D image and the template 3D mesh, a 3D mesh for the object, wherein the 3D mesh for the object is generated by deforming the template 3D mesh with the trained machine learning model, and wherein the 3D mesh for the object is usable to map a texture to the object or to generate a 3D animation of the object; and
processing, using a two-dimensional (UV) regressor, the 2D image and the 3D mesh of the object to generate a mapping from vertices of the 3D mesh of the object to the 2D image, wherein the mapping is used to apply the texture to the 3D mesh of the object, and wherein processing the 2D image and the 3D mesh of the object to generate the mapping comprises:
determining a descriptor loss for each side of the 3D mesh of the object;
determining a particular side of the 3D mesh of the object that is associated with the smallest descriptor loss of the determined descriptor losses; and
utilizing the particular side of the 3D mesh of the object that is associated with the smallest descriptor loss as a basis for a UV map for the 3D mesh of the object.
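
Illustrative sketch (not part of the claim): the category-determination step can be pictured as a segmentation network whose number of output classes equals the number of predetermined descriptors, followed by a lookup from the set of detected portions to a category. The network architecture, descriptor set, and lookup table below are hypothetical stand-ins, not the patented implementation.

```python
import torch
import torch.nn as nn

# Hypothetical set of predetermined descriptors (object portions).
DESCRIPTORS = ["head", "torso", "leg", "tail", "wing"]

class SegNet(nn.Module):
    """Toy segmentation head: one output class per predetermined descriptor."""
    def __init__(self, num_classes: int = len(DESCRIPTORS)):
        super().__init__()
        # The claim requires the class count to match the descriptor count.
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, num_classes, 1),  # per-pixel logits, one per descriptor
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.body(image)  # (B, num_classes, H, W)

# Hypothetical mapping from detected portion sets to an object category.
PART_SETS_TO_CATEGORY = {
    frozenset({"head", "torso", "leg", "tail"}): "quadruped",
    frozenset({"head", "torso", "leg", "wing"}): "bird",
}

def categorize(image: torch.Tensor, net: SegNet) -> str:
    """Determine the object category from the portions found in a 2D image."""
    logits = net(image)             # (1, C, H, W)
    labels = logits.argmax(dim=1)   # per-pixel descriptor index
    present = frozenset(DESCRIPTORS[i] for i in labels.unique().tolist())
    return PART_SETS_TO_CATEGORY.get(present, "unknown")

# Example call: categorize(torch.rand(1, 3, 64, 64), SegNet())
```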
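Illustrative sketch (not part of the claim): mesh generation by template deformation is commonly realized by regressing a per-vertex displacement for the category's template mesh conditioned on an image code. The encoder, latent size, and MLP below are hypothetical; a real system would use a substantially larger backbone. Because only vertex positions change, the template's face connectivity is preserved, which is what keeps the output mesh directly usable for texturing and animation.

```python
import torch
import torch.nn as nn

class TemplateDeformer(nn.Module):
    """Predict per-vertex displacements for a template 3D mesh from a 2D image."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(          # toy image encoder
            nn.Conv2d(3, 32, 4, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, latent_dim), nn.ReLU(),
        )
        # Condition each vertex on (its template position, the image code).
        self.offset_mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, 64), nn.ReLU(),
            nn.Linear(64, 3),                  # (dx, dy, dz) per vertex
        )

    def forward(self, image: torch.Tensor, template_vertices: torch.Tensor):
        # image: (B, 3, H, W); template_vertices: (V, 3)
        code = self.encoder(image)                                 # (B, latent)
        v = template_vertices.unsqueeze(0).expand(code.shape[0], -1, -1)
        code_per_v = code.unsqueeze(1).expand(-1, v.shape[1], -1)  # (B, V, latent)
        offsets = self.offset_mlp(torch.cat([v, code_per_v], dim=-1))
        # Deformed vertices; faces are reused unchanged from the template.
        return v + offsets
```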
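Illustrative sketch (not part of the claim): the UV-map selection step scores each side of the mesh with a descriptor loss and keeps the side with the smallest loss. Here a "side" is read as one of six axis-aligned planar projections, and descriptor_loss is a hypothetical placeholder; the claim does not mandate either particular. In practice the loss would compare descriptors sampled at the projected UV coordinates against descriptors detected in the 2D image.

```python
import torch

# Six candidate sides: project away one axis, keep the other two as (u, v).
SIDES = {
    "+x": (1, 2), "-x": (1, 2),
    "+y": (0, 2), "-y": (0, 2),
    "+z": (0, 1), "-z": (0, 1),
}

def planar_uv(vertices: torch.Tensor, keep_axes) -> torch.Tensor:
    """Planar projection of (V, 3) vertices onto two axes, normalized to [0, 1]."""
    uv = vertices[:, list(keep_axes)]
    lo, hi = uv.min(dim=0).values, uv.max(dim=0).values
    return (uv - lo) / (hi - lo).clamp(min=1e-8)

def descriptor_loss(uv: torch.Tensor, image_descriptors: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in: a real loss would measure agreement between
    # per-vertex descriptors at these UV coordinates and the image descriptors.
    return torch.rand(())

def best_side_uv(vertices: torch.Tensor, image_descriptors: torch.Tensor):
    """Pick the side with the smallest descriptor loss as the UV-map basis."""
    losses = {
        side: descriptor_loss(planar_uv(vertices, axes), image_descriptors)
        for side, axes in SIDES.items()
    }
    best = min(losses, key=lambda s: losses[s].item())
    return best, planar_uv(vertices, SIDES[best])
```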