CPC G06N 3/084 (2013.01) [B25J 9/163 (2013.01); B25J 9/1697 (2013.01); G05B 13/027 (2013.01); G06V 10/70 (2022.01); G06V 10/82 (2022.01); G06V 20/52 (2022.01); H04N 7/181 (2013.01)] | 18 Claims |
1. A method of training a neural network having a plurality of network parameters, wherein the neural network is configured to receive an input observation characterizing a state of an environment and to process the input observation to generate a numeric embedding of the state of the environment, the method comprising:
obtaining a first observation captured by a first modality;
obtaining a second observation that is co-occurring with the first observation and that is captured by a second, different modality;
obtaining a third observation captured by the first modality that is not co-occurring with the first observation;
determining a gradient of a triplet loss that uses the first observation as an anchor example, the second observation as a positive example, and the third observation as a negative example; and
updating current values of the network parameters using the gradient of the triplet loss,
wherein the observations are images related to a same environment, wherein the first modality is a camera at a first viewpoint, and wherein the second modality is another camera at a second, different viewpoint.
|
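The training step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a linear embedding `f(x) = W x` in place of a neural network, a squared-Euclidean hinge form of the triplet loss, a plain gradient-descent update, and short feature vectors standing in for camera images. The names `embed`, `triplet_loss_and_grad`, `sgd_step`, `margin`, and `lr` are hypothetical and do not appear in the claim.

```python
# Sketch only: a linear map stands in for the neural network of claim 1.

def embed(W, x):
    """Embed observation x with the linear map W (list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def triplet_loss_and_grad(W, anchor, positive, negative, margin=0.2):
    """Return (loss, dloss/dW) for one anchor/positive/negative triplet.

    loss = max(0, |f(a) - f(p)|^2 - |f(a) - f(n)|^2 + margin)
    """
    dp = [a - p for a, p in zip(anchor, positive)]
    dn = [a - n for a, n in zip(anchor, negative)]
    # Because f is linear, f(a) - f(p) equals W applied to (a - p).
    ep = embed(W, dp)
    en = embed(W, dn)
    loss = sum(v * v for v in ep) - sum(v * v for v in en) + margin
    if loss <= 0.0:
        # Hinge is inactive: the triplet already satisfies the margin.
        return 0.0, [[0.0] * len(anchor) for _ in W]
    grad = [
        [2.0 * ep[i] * dp[j] - 2.0 * en[i] * dn[j] for j in range(len(dp))]
        for i in range(len(W))
    ]
    return loss, grad


def sgd_step(W, grad, lr=0.1):
    """Update the parameters with one gradient-descent step."""
    return [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W, grad)]


# Toy triplet (illustrative values, not real image data):
anchor = [1.0, 0.0]    # first camera, time t
positive = [0.0, 1.0]  # second camera, same time t (co-occurring)
negative = [0.9, 0.1]  # first camera, a different time

W = [[1.0, 0.0], [0.0, 1.0]]
loss_before, grad = triplet_loss_and_grad(W, anchor, positive, negative)
W = sgd_step(W, grad)
loss_after, _ = triplet_loss_and_grad(W, anchor, positive, negative)
print(loss_before, loss_after)
```

In this toy triplet the negative looks more like the anchor than the positive does, as two frames from the same camera often do. The update therefore pulls the co-occurring cross-view pair together and pushes the same-view, different-time pair apart, which lowers the loss.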