CPC G05D 1/0246 (2013.01) [G05D 1/0278 (2013.01); G06V 10/22 (2022.01); G06V 10/764 (2022.01); G06V 10/774 (2022.01); G06V 20/70 (2022.01); G06V 10/82 (2022.01)]
15 Claims
1. A processor-implemented method for goal-conditioned exploration for object goal navigation, the method comprising:
receiving, via one or more hardware processors of an agent, an image of a goal and a pre-defined number of time instances within which the agent must reach the goal, wherein the goal refers to an object of interest among a plurality of objects in a plurality of regions within an environment of the agent;
initializing, via the one or more hardware processors, the agent at a random location within a region among the plurality of regions;
constructing, via the one or more hardware processors, a spatial occupancy map and a semantic graph corresponding to the random location; and
performing, via the one or more hardware processors, a plurality of steps for (i) each of the pre-defined number of time instances or (ii) until the agent reaches the goal, wherein the plurality of steps comprise:
obtaining a plurality of sensory inputs from a plurality of sensors associated with the agent;
updating the spatial occupancy map and the semantic graph based on the plurality of sensory inputs, wherein the semantic graph comprises one or more objects among the plurality of objects;
predicting a region class probability vector corresponding to each object in the semantic graph using a region classification network, wherein the region class probability vector indicates the probability of the object occurring in each of the plurality of regions;
computing a Co-occurrence Likelihood (CL) score for each object in the semantic graph as the product of the corresponding region class probability vector and a goal distribution vector, wherein the CL score indicates the likelihood of the object and the goal occurring together in a region among the plurality of regions, wherein the goal distribution vector is computed by analyzing the distribution of the goal in each of the plurality of regions in training scenes, and wherein identified potential sub-goals in the semantic graph have a CL score greater than a pre-defined threshold value,
wherein the CL score is calculated as
$$\text{CL score} = \sum_{r=0}^{N_R-1} p(o \text{ in region}[r]) \cdot p(G \text{ in region}[r]),$$
wherein region is the list of the plurality of regions in the environment, $N_R$ is the total number of regions, o is the object for which the CL score is calculated, G is the goal, p(o in region[r]) is the region class probability vector corresponding to the object for which the CL score is calculated, and p(G in region[r]) is the goal distribution vector computed a priori by analyzing the distribution of the goal G in region[r] based on how many times the goal has occurred in the region,
wherein the sum of the values of the goal distribution vector is always 1 for each object, and the number of values in the goal distribution vector is equal to the number of regions in the environment;
identifying if there are any potential sub-goals in the semantic graph based on the CL score; and
performing one of: (a) navigating the agent towards the closest potential sub-goal among the identified potential sub-goals if one or more potential sub-goals are identified, and (b) navigating the agent towards a long-term goal based on the spatial occupancy map, wherein the long-term goal is selected from an exploration policy.
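The CL score computation and the sub-goal choice recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the dictionary layout of the semantic graph, the region names, the occurrence counts, the threshold of 0.4, and the use of straight-line (Euclidean) distance to decide which sub-goal is "closest" are all assumptions made for illustration; the claim does not specify a distance metric or any data structure.

```python
from math import hypot


def goal_distribution(region_counts):
    """Normalize per-region goal occurrence counts so the values sum to 1."""
    total = sum(region_counts)
    if total == 0:
        raise ValueError("goal never observed in the training scenes")
    return [count / total for count in region_counts]


def cl_score(region_probs, goal_dist):
    """Sum over regions r of p(o in region[r]) * p(G in region[r])."""
    if len(region_probs) != len(goal_dist):
        raise ValueError("both vectors need one value per region")
    return sum(p * g for p, g in zip(region_probs, goal_dist))


def select_subgoal(objects, goal_dist, threshold, agent_xy):
    """Return the closest object whose CL score exceeds the threshold.

    Returns None when no potential sub-goal exists, in which case the
    agent would fall back to a long-term goal from its exploration policy.
    Distance is Euclidean here, which is an assumption of this sketch.
    """
    candidates = [
        (hypot(x - agent_xy[0], y - agent_xy[1]), name)
        for name, (probs, (x, y)) in objects.items()
        if cl_score(probs, goal_dist) > threshold
    ]
    return min(candidates)[1] if candidates else None


# Hypothetical data: regions are [bedroom, kitchen, bathroom] and the
# goal was seen 0, 8 and 2 times in those regions during training.
goal_dist = goal_distribution([0, 8, 2])

# Hypothetical semantic graph: name -> (region class probabilities, (x, y)).
objects = {
    "fridge": ([0.1, 0.8, 0.1], (4.0, 3.0)),
    "bed": ([0.9, 0.05, 0.05], (1.0, 0.0)),
    "sink": ([0.0, 0.5, 0.5], (2.0, 0.0)),
}

subgoal = select_subgoal(objects, goal_dist, threshold=0.4, agent_xy=(0.0, 0.0))
```

With these made-up numbers, the fridge and the sink both score above the threshold and the bed does not, so the sketch picks the sink because it is nearer to the agent. Raising the threshold above every score makes `select_subgoal` return `None`, which corresponds to branch (b).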