CPC G06T 5/77 (2024.01) [G06N 3/04 (2013.01); G06T 7/136 (2017.01); G06V 10/757 (2022.01)] | 20 Claims |
1. A method comprising:
sending an input image to a backbone network;
generating, via the backbone network, one or more image feature outputs;
sending the one or more image feature outputs to a spatial attention module for generating a feature map associated with one or more objects in the input image;
sending the feature map to a category feature module for generating an instance category output indicating the one or more objects;
sending the one or more image feature outputs directly from the backbone network to a mask generating module for generating one or more masks, each associated with an object in the input image; and
generating:
the instance category output via the category feature module; and
the one or more masks via the mask generating module.
|
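The data flow recited in claim 1 can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in chosen for illustration, not the claimed implementation: the function names (`backbone`, `spatial_attention`, `category_module`, `mask_module`, `run_claim_1`), the toy feature extraction, the softmax-based attention, the score threshold, and the mean-threshold masks are all assumptions. The sketch only aims to show the branching: the image feature outputs pass through the spatial attention module to the category feature module on one path, and pass directly from the backbone to the mask generating module on the other.

```python
import math


def backbone(image):
    """Hypothetical backbone: returns one or more image feature outputs."""
    # Feature output 1: a copy of the raw intensities.
    raw = [row[:] for row in image]
    # Feature output 2: absolute horizontal differences (a crude edge feature).
    grad = [
        [abs(row[min(x + 1, len(row) - 1)] - row[x]) for x in range(len(row))]
        for row in image
    ]
    return [raw, grad]


def spatial_attention(features):
    """Hypothetical spatial attention: softmax over positions reweights the summed features."""
    h, w = len(features[0]), len(features[0][0])
    summed = [[sum(f[y][x] for f in features) for x in range(w)] for y in range(h)]
    peak = max(max(row) for row in summed)  # subtracted for numerical stability
    exps = [[math.exp(v - peak) for v in row] for row in summed]
    total = sum(sum(row) for row in exps)
    return [[summed[y][x] * exps[y][x] / total for x in range(w)] for y in range(h)]


def category_module(feature_map, threshold=0.05):
    """Hypothetical category head: labels the image from the attended feature map."""
    score = max(max(row) for row in feature_map)
    label = "object" if score > threshold else "background"
    return {"label": label, "score": score}


def mask_module(features):
    """Hypothetical mask head: one binary mask per feature output, thresholded at its mean."""
    masks = []
    for f in features:
        flat = [v for row in f for v in row]
        mean = sum(flat) / len(flat)
        masks.append([[1 if v > mean else 0 for v in row] for row in f])
    return masks


def run_claim_1(image):
    features = backbone(image)                 # image -> image feature outputs
    feature_map = spatial_attention(features)  # branch 1: features -> attention
    category = category_module(feature_map)    # feature map -> instance category output
    masks = mask_module(features)              # branch 2: features directly -> masks
    return category, masks


image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
category, masks = run_claim_1(image)
```

In this sketch `mask_module` never sees the output of `spatial_attention`, which mirrors the "directly from the backbone network" limitation; a real system would presumably replace each toy function with a trained network component.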