US 11,886,494 B2
Utilizing natural language processing automatically select objects in images
Walter Wei Tuh Chang, San Jose, CA (US); Khoi Pham, Hyattsville, MD (US); Scott Cohen, Sunnyvale, CA (US); Zhe Lin, Fremont, CA (US); and Zhihong Ding, Fremont, CA (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on Sep. 1, 2022, as Appl. No. 17/929,206.
Application 17/929,206 is a continuation of application No. 16/800,415, filed on Feb. 25, 2020, granted, now 11,468,110.
Prior Publication US 2022/0414142 A1, Dec. 29, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/583 (2019.01); G06F 16/532 (2019.01); G06F 16/33 (2019.01); G06T 11/60 (2006.01); G06F 40/279 (2020.01); G06F 40/247 (2020.01); G06N 20/00 (2019.01); G06F 16/242 (2019.01); G06F 16/28 (2019.01); G06F 16/538 (2019.01); G06F 40/30 (2020.01); G06F 18/2431 (2023.01); G06V 10/82 (2022.01)
CPC G06F 16/5854 (2019.01) [G06F 16/243 (2019.01); G06F 16/288 (2019.01); G06F 16/3344 (2019.01); G06F 16/532 (2019.01); G06F 16/538 (2019.01); G06F 18/2431 (2023.01); G06F 40/247 (2020.01); G06F 40/279 (2020.01); G06F 40/30 (2020.01); G06N 20/00 (2019.01); G06T 11/60 (2013.01); G06V 10/82 (2022.01)]
20 Claims
 
1. A non-transitory computer-readable medium storing executable instructions that, when executed by a processing device, cause the processing device to perform operations comprising:
receiving a query string to select a query object in a digital image;
analyzing the query string to identify a plurality of object terms and at least one relationship term linking the plurality of object terms;
identifying a plurality of object classes corresponding to the plurality of object terms;
generating object masks for objects of the plurality of object classes utilizing one or more object detection models;
analyzing the object masks generated for the objects of the plurality of object classes to identify an object mask that satisfies a relationship type defined by the at least one relationship term by determining at least one of a spatial relationship, a proximity relationship, a depth relationship, a relative position relationship, an absolute position relationship, or an exclusion relationship between two or more of the object masks; and
providing the digital image with the query object selected by providing the object mask in response to receiving the query string.
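
The operations recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The names `RELATIONSHIP_TERMS`, `ObjectMask`, `parse_query`, `satisfies`, and `select_query_object` are hypothetical. Each object term is assumed to map to an object class of the same name. The object detection models are replaced by a precomputed list of masks. Relationships are evaluated by comparing mask centroids, which is only one possible way to test a relative position or proximity relationship.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: the claim names relationship types, not specific words.
RELATIONSHIP_TERMS = {
    "left of": "relative position relationship",
    "right of": "relative position relationship",
    "near": "proximity relationship",
}


@dataclass(frozen=True)
class ObjectMask:
    """Stand-in for a detection model's output: a class label and its pixels."""
    object_class: str
    pixels: frozenset  # set of (row, col) tuples covered by the mask

    def centroid(self):
        rows = [r for r, _ in self.pixels]
        cols = [c for _, c in self.pixels]
        return sum(rows) / len(rows), sum(cols) / len(cols)


def parse_query(query):
    """Split a query such as 'hat left of dog' into object terms and a relationship term."""
    text = query.lower().strip()
    # Try longer terms first so a short term never shadows a longer one.
    for term in sorted(RELATIONSHIP_TERMS, key=len, reverse=True):
        head, sep, tail = text.partition(f" {term} ")
        if sep:
            return [head.strip(), tail.strip()], term
    raise ValueError("no known relationship term in query")


def satisfies(term, candidate, anchor, near_threshold=5.0):
    """Check whether `candidate` stands in relationship `term` to `anchor`."""
    (r1, c1), (r2, c2) = candidate.centroid(), anchor.centroid()
    if term == "left of":
        return c1 < c2
    if term == "right of":
        return c1 > c2
    if term == "near":
        # near_threshold is an assumed pixel distance, chosen only for illustration.
        return ((r1 - r2) ** 2 + (c1 - c2) ** 2) ** 0.5 <= near_threshold
    raise ValueError(f"unsupported relationship term: {term}")


def select_query_object(query, detected_masks):
    """Return the first mask of the query object's class that satisfies the relationship."""
    (target_term, anchor_term), relation = parse_query(query)
    # Assumption: object term and object class are the same string.
    targets = [m for m in detected_masks if m.object_class == target_term]
    anchors = [m for m in detected_masks if m.object_class == anchor_term]
    for target in targets:
        if any(satisfies(relation, target, anchor) for anchor in anchors):
            return target
    return None


# Toy scene: two hats on either side of one dog.
hat_a = ObjectMask("hat", frozenset({(0, 0), (0, 1)}))
hat_b = ObjectMask("hat", frozenset({(0, 9)}))
dog = ObjectMask("dog", frozenset({(0, 5), (1, 5)}))
scene = [hat_a, hat_b, dog]

selected = select_query_object("hat left of dog", scene)
```

In this toy scene, `selected` is the mask `hat_a`, because its centroid lies in a lower column than the dog's. A fuller treatment would also cover the depth, absolute position, and exclusion relationships listed in the claim, which this sketch omits.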