| CPC G06V 10/774 (2022.01) [G06T 15/20 (2013.01); G06V 10/764 (2022.01); G06V 10/776 (2022.01); G06V 20/70 (2022.01)] | 12 Claims |

|
1. A computer-implemented method for building an object detection module, comprising:
obtaining mesh representations of objects belonging to specified object classes of interest,
rendering a plurality of images by a physics-based simulator using the mesh representations of the objects, wherein each rendered image captures a simulated environment containing objects belonging to multiple of said object classes of interest placed in a bin or on a table, wherein the plurality of rendered images are generated by randomizing a set of parameters by the simulator to render a range of simulated environments, the set of parameters including environmental and sensor-based parameters,
generating a label for each rendered image, the label including a two-dimensional representation indicative of location and object classes of objects in the respective rendered image frame, wherein each rendered image and the respective label constitute a data sample of a synthetic training dataset,
training a deep learning model using the synthetic training dataset to output object classes from an input image of a real-world physical environment,
deploying the trained deep learning model for testing on a set of real-world test images to generate an inference output for each test image,
adjusting the training dataset based on a success of the generated inference outputs,
identifying a “failure” image from the set of test images for which the generated inference output does not meet a defined success criterion,
feeding the “failure” image to the simulator to render additional images by randomizing the set of parameters around an environmental or sensor-based setting that corresponds to the “failure” image and to generate a respective label for each additional rendered image, and
retraining the deep learning model using the rendered additional images and the respective labels.
|
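The two randomization steps recited in claim 1 (sampling parameters over a broad range for the initial dataset, then sampling around the setting that corresponds to a "failure" image) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the claim: the `SceneParams` fields, the numeric ranges in `RANGES`, the `rel_width` neighborhood size, and the confidence threshold used as the success criterion. The sketch also assumes the environmental or sensor-based setting of the failure image has already been estimated by some other means, and it does not model the rendering, labeling, or training steps.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneParams:
    """Hypothetical parameter set; the claim only says 'environmental and sensor-based'."""
    light_intensity: float   # environmental (assumed example)
    camera_height_m: float   # sensor-based (assumed example)
    sensor_noise_std: float  # sensor-based (assumed example)


# Assumed global randomization ranges (illustrative values only).
RANGES = {
    "light_intensity": (0.2, 1.0),
    "camera_height_m": (0.5, 1.5),
    "sensor_noise_std": (0.0, 0.05),
}


def sample_global(rng: random.Random) -> SceneParams:
    """Draw one parameter set uniformly over the full ranges (initial dataset)."""
    return SceneParams(**{k: rng.uniform(lo, hi) for k, (lo, hi) in RANGES.items()})


def sample_around(center: SceneParams, rng: random.Random,
                  rel_width: float = 0.1) -> SceneParams:
    """Draw one parameter set in a narrow window around a failure setting.

    The window spans rel_width of each global range, centered on the
    failure setting, and is clamped to the global range.
    """
    values = {}
    for name, (lo, hi) in RANGES.items():
        half = rel_width * (hi - lo) / 2
        c = getattr(center, name)
        values[name] = min(hi, max(lo, rng.uniform(c - half, c + half)))
    return SceneParams(**values)


def is_failure(confidence: float, threshold: float = 0.5) -> bool:
    """Assumed success criterion: inference confidence must reach the threshold."""
    return confidence < threshold


if __name__ == "__main__":
    rng = random.Random(0)
    initial = [sample_global(rng) for _ in range(3)]
    # Suppose a test image failed and its setting was estimated as follows.
    failure_setting = SceneParams(0.3, 1.4, 0.01)
    if is_failure(confidence=0.2):
        additional = [sample_around(failure_setting, rng) for _ in range(3)]
        print(additional)
```

In a full pipeline, each sampled `SceneParams` would be handed to the simulator to render an image and its label; here the samples are only printed, since the simulator interface is not specified in the claim.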