| CPC G06T 11/60 (2013.01) [G06F 40/40 (2020.01); G06T 7/11 (2017.01); G06T 7/60 (2013.01); G06T 7/75 (2017.01); G06T 2207/20081 (2013.01)] | 20 Claims |

|
1. A method comprising:
accessing a corpus of images and associated text expressing first spatial relationships between objects in the images;
using a machine-trained object detection module:
detecting the objects in respective images of the corpus; and
determining respective locations of the detected objects in the respective images;
based at least on the respective locations of the detected objects, determining second spatial relationships between the detected objects in the respective images; and
cleansing the corpus by:
removing individual images from the corpus having first spatial relationships expressed by the text that do not match corresponding second spatial relationships determined from the respective locations of the detected objects; and
retaining, in the corpus, other images having first spatial relationships expressed by the text that match corresponding second spatial relationships determined from the respective locations of the detected objects.
|
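
The cleansing steps of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the `Box` type, the `spatial_relation` and `cleanse` functions, the four relation names, and the center-comparison rule are all hypothetical choices made for illustration. The detector output is assumed to be already available as labeled bounding boxes, the associated text is assumed to be already parsed into a (subject, relation, object) triple, and images where either object was not detected are assumed to be removed, which the claim does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical detector output: a label and a pixel bounding box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def spatial_relation(a, b):
    """Second spatial relationship of a relative to b, from box centers.

    Assumes image coordinates where y grows downward.
    """
    (ax, ay), (bx, by) = a.center, b.center
    dx, dy = ax - bx, ay - by
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def cleanse(corpus, detections):
    """Retain only entries whose text relation matches the detected relation.

    corpus: list of (image_id, (subject, relation, object)) pairs.
    detections: dict mapping image_id to a list of Box objects.
    """
    kept = []
    for image_id, (subj, relation, obj) in corpus:
        boxes = {b.label: b for b in detections.get(image_id, [])}
        # Assumption: an image is removed if either object went undetected.
        if subj not in boxes or obj not in boxes:
            continue
        if spatial_relation(boxes[subj], boxes[obj]) == relation:
            kept.append((image_id, (subj, relation, obj)))
    return kept


corpus = [
    ("img1", ("cat", "left of", "dog")),
    ("img2", ("cup", "above", "table")),
]
detections = {
    "img1": [Box("cat", 0, 0, 10, 10), Box("dog", 20, 0, 30, 10)],
    "img2": [Box("cup", 0, 50, 10, 60), Box("table", 0, 0, 40, 20)],
}
cleansed = cleanse(corpus, detections)
```

In this made-up data, the text for `img1` agrees with the box positions, so the image is retained. The text for `img2` says the cup is above the table, but the detected cup box sits lower in the image, so the image is removed.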