US 12,293,472 B2
Systems and methods for masking a recognized object during an application of a synthetic element to an original image
Maximilian H. Allan, San Francisco, CA (US); Mahdi Azizian, San Jose, CA (US); and A. Jonathan McLeod, Sunnyvale, CA (US)
Assigned to Intuitive Surgical Operations, Inc., Sunnyvale, CA (US)
Appl. No. 17/793,595
Filed by Intuitive Surgical Operations, Inc., Sunnyvale, CA (US)
PCT Filed Jan. 18, 2021, PCT No. PCT/US2021/013826 § 371(c)(1), (2) Date Jul. 18, 2022, PCT Pub. No. WO2021/150459, PCT Pub. Date Jul. 29, 2021.
Claims priority of provisional application 62/963,249, filed on Jan. 20, 2020.
Prior Publication US 2023/0050857 A1, Feb. 16, 2023
Int. Cl. G06T 19/00 (2011.01); G06T 7/20 (2017.01); G06T 7/50 (2017.01)

CPC G06T 19/006 (2013.01) [G06T 7/20 (2013.01); G06T 7/50 (2017.01); G06T 2207/10016 (2013.01); G06T 2210/41 (2013.01)]

17 Claims

1. A system comprising:

a memory storing instructions; and

a processor communicatively coupled to the memory and configured to execute the instructions to:

access a model of a recognized object depicted in an original image of a scene;

associate the model with the recognized object; and

generate presentation data for use by a presentation system to present an augmented version of the original image in which a synthetic element added to the original image is, based on the model as associated with the recognized object, prevented from occluding at least a portion of the recognized object,

wherein the associating of the model with the recognized object includes:

generating a depth map of imagery depicted by the original image, the depth map including first depth data for a depiction of the recognized object within the imagery and second depth data for a remainder of the imagery, the first depth data based on the model of the recognized object and denser than the second depth data; and

segmenting the original image to distinguish pixels of the original image that depict the recognized object from pixels of the original image that do not depict the recognized object by

identifying the pixels of the original image that depict the recognized object based on the first depth data; and

identifying the pixels of the original image that do not depict the recognized object based on the second depth data.
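
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the model of the recognized object has already been rendered into a dense per-pixel depth grid (`model_depth`), that the rest of the scene has only sparse depth samples (`scene_depth`), and that the synthetic element is a simple per-pixel overlay. All function names, grid layouts, and values are hypothetical and chosen only for illustration. As a simplification, every pixel without model-derived depth is treated as not depicting the object, whether or not a sparse sample exists there.

```python
from typing import List, Optional, Tuple

DepthGrid = List[List[Optional[float]]]
Mask = List[List[bool]]


def build_depth_map(model_depth: DepthGrid,
                    scene_depth: DepthGrid) -> Tuple[DepthGrid, Mask]:
    """Merge dense model-derived depth (first depth data) with sparse
    scene depth (second depth data).

    Returns the merged depth map and a mask that is True wherever the
    depth value came from the model, i.e. where the object is depicted.
    """
    depth_map: DepthGrid = []
    object_mask: Mask = []
    for model_row, scene_row in zip(model_depth, scene_depth):
        depth_row: List[Optional[float]] = []
        mask_row: List[bool] = []
        for m, s in zip(model_row, scene_row):
            if m is not None:
                depth_row.append(m)      # dense, model-based depth
                mask_row.append(True)
            else:
                depth_row.append(s)      # sparse; None where unsampled
                mask_row.append(False)
        depth_map.append(depth_row)
        object_mask.append(mask_row)
    return depth_map, object_mask


def composite(image: List[List[int]],
              overlay: List[List[Optional[int]]],
              object_mask: Mask) -> List[List[int]]:
    """Draw the overlay everywhere except on pixels depicting the object."""
    augmented = []
    for img_row, ov_row, mask_row in zip(image, overlay, object_mask):
        augmented.append([
            img if (ov is None or is_obj) else ov
            for img, ov, is_obj in zip(img_row, ov_row, mask_row)
        ])
    return augmented


# Hypothetical 3x4 scene: the object covers the two middle columns of rows 1-2.
image = [[10, 10, 10, 10] for _ in range(3)]
model_depth = [[None, None, None, None],
               [None, 5.0, 5.1, None],
               [None, 5.0, 5.1, None]]
scene_depth = [[8.0, None, None, 8.2],
               [None, None, None, None],
               [7.9, None, None, None]]
# Synthetic element: a horizontal band across row 1.
overlay = [[None] * 4, [99, 99, 99, 99], [None] * 4]

depth_map, object_mask = build_depth_map(model_depth, scene_depth)
augmented = composite(image, overlay, object_mask)
print(augmented[1])  # [99, 10, 10, 99]
```

In this sketch the synthetic band is drawn on the background pixels of row 1 but is suppressed on the two pixels where model-derived depth marks the recognized object, so the object is not occluded in the augmented image.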