US 11,798,196 B2
Video-based point cloud compression with predicted patches
Jungsun Kim, San Jose, CA (US); Khaled Mammou, Vancouver (CA); and Alexandros Tourapis, Los Gatos, CA (US)
Assigned to Apple Inc., Cupertino, CA (US)
Filed by Apple Inc., Cupertino, CA (US)
Filed on Jan. 8, 2021, as Appl. No. 17/145,205.
Claims priority of provisional application 62/958,661, filed on Jan. 8, 2020.
Prior Publication US 2021/0217203 A1, Jul. 15, 2021
Int. Cl. G06T 9/00 (2006.01); G06T 17/20 (2006.01); G06T 3/40 (2006.01)

CPC G06T 9/001 (2013.01) [G06T 3/40 (2013.01); G06T 17/20 (2013.01); G06T 2210/12 (2013.01)]

20 Claims

1. A non-transitory, computer-readable, medium storing program instructions that, when executed by one or more processors, cause the one or more processors to:

determine, for three-dimensional (3D) visual volumetric content, a plurality of patches corresponding to portions of the 3D visual volumetric content;

generate, for respective patches of the plurality of patches, respective patch images comprising sets of points or vertices of the 3D visual volumetric content that correspond to the respective patches when the portions of the 3D visual volumetric content are projected onto respective patch planes for the patches;

pack the generated patch images into one or more two-dimensional (2D) image frames that are to be encoded to communicate a compressed version of the 3D visual volumetric content;

generate auxiliary information for the compressed version of the 3D visual volumetric content, the auxiliary information indicating for respective ones of the one or more 2D image frames:

respective sizes of bounding boxes for the patch images and respective locations of the bounding boxes in the 2D image frame;

respective locations or characteristics of the patches in a 3D reconstructed version of the 3D visual volumetric content; and

one or more indications of:

one or more predicted patches in the 2D image frame, wherein a respective one of the one or more predicted patches is indicated by referencing a corresponding reference patch in the same 2D image frame and signaling a plurality of residual values for the respective one of the predicted patches, wherein the plurality of residual values are relative to corresponding values of the reference patch; or

one or more copied patches in the 2D image frame, wherein the one or more copied patches reference a corresponding reference patch in the same 2D image frame and are signaled without repeating information that is being copied from the reference patch; and

encode the 2D image frame and the auxiliary information to generate the compressed version of the 3D visual volumetric content.
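
The predicted-patch and copied-patch signaling recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation or any standardized bitstream syntax. The record layout, the mode names ("intra", "predicted", "copied"), the field order in `FIELDS`, and the reference-selection rule (the earlier patch with the smallest sum of absolute differences) are all assumptions made for illustration. The sketch also simplifies by letting a copied patch reuse every field of its reference, whereas the claim only requires that copied information not be repeated.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical per-patch fields: 2D bounding-box location and size,
# followed by a 3D offset. The claim does not fix this layout.
FIELDS = ("u0", "v0", "width", "height", "x3d", "y3d", "z3d")
Patch = Tuple[int, ...]


@dataclass
class PatchRecord:
    mode: str                 # "intra", "predicted", or "copied" (assumed names)
    ref_index: Optional[int]  # reference patch in the same 2D image frame
    values: Tuple[int, ...]   # full values, residuals, or empty for a copy


def encode_patches(patches: List[Patch]) -> List[PatchRecord]:
    """Build auxiliary records, referencing earlier patches of the same frame."""
    records: List[PatchRecord] = []
    for i, cur in enumerate(patches):
        if i == 0:
            # Nothing to reference yet, so signal all values explicitly.
            records.append(PatchRecord("intra", None, cur))
            continue
        # Assumed heuristic: reference the closest earlier patch.
        best = min(
            range(i),
            key=lambda j: sum(abs(a - b) for a, b in zip(cur, patches[j])),
        )
        residual = tuple(a - b for a, b in zip(cur, patches[best]))
        if not any(residual):
            # Identical to the reference: signal only the reference index.
            records.append(PatchRecord("copied", best, ()))
        else:
            # Signal residuals relative to the reference patch's values.
            records.append(PatchRecord("predicted", best, residual))
    return records


def decode_patches(records: List[PatchRecord]) -> List[Patch]:
    """Reconstruct per-patch values from the auxiliary records."""
    patches: List[Patch] = []
    for rec in records:
        if rec.mode == "intra":
            patches.append(rec.values)
        elif rec.mode == "copied":
            patches.append(patches[rec.ref_index])
        else:
            ref = patches[rec.ref_index]
            patches.append(tuple(r + d for r, d in zip(ref, rec.values)))
    return patches


if __name__ == "__main__":
    frame = [
        (0, 0, 16, 16, 10, 20, 30),
        (16, 0, 16, 18, 12, 20, 30),
        (0, 0, 16, 16, 10, 20, 30),
    ]
    recs = encode_patches(frame)
    for rec in recs:
        print(rec)
    assert decode_patches(recs) == frame
```

In this sketch a predicted patch carries only small residuals and a copied patch carries no values at all, which is where the reduction in auxiliary information would come from. A practical encoder would presumably also compare the cost of each mode against explicit signaling before choosing one.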