US 12,254,414 B2
Autoencoding generative adversarial network for augmenting training data usable to train predictive models
Mårten Nilsson, Danderyd (SE)
Assigned to Tobii AB, Danderyd (SE)
Appl. No. 17/056,272
Filed by Tobii AB, Danderyd (SE)
PCT Filed May 13, 2019, PCT No. PCT/SE2019/050420
§ 371(c)(1), (2) Date Nov. 17, 2020,
PCT Pub. No. WO2019/221654, PCT Pub. Date Nov. 21, 2019.
Claims priority of provisional application 62/672,985, filed on May 17, 2018.
Prior Publication US 2021/0256353 A1, Aug. 19, 2021
Int. Cl. G06N 3/04 (2023.01); G06F 18/214 (2023.01); G06N 3/045 (2023.01); G06N 3/047 (2023.01); G06N 3/084 (2023.01); G06N 3/088 (2023.01)
CPC G06N 3/088 (2013.01) [G06F 18/214 (2023.01); G06N 3/045 (2023.01); G06N 3/047 (2023.01); G06N 3/084 (2013.01)] 16 Claims
OG exemplary drawing
 
1. A method for gaze prediction, the method being implemented on a computer system and comprising:
providing, to a gaze prediction model, a user image that shows at least a user eye; and
receiving, from the gaze prediction model, a prediction of a user gaze based on the user image,
wherein:
the gaze prediction model is trained based on an augmented training image that is generated by a generator network,
the generator network is trained to generate the augmented training image based on a training of an autoencoder network and a generative adversarial network,
the autoencoder network comprises the generator network and an encoder network,
the generative adversarial network comprises the generator network and a discriminator network,
a loss function for training the generator network comprises a first loss term associated with training the encoder network and a second loss term associated with training the generative adversarial network, and
the training of the autoencoder network and the generative adversarial network comprises:
providing, to the encoder network, a training image;
mapping, to a latent space, a code vector that is generated by the encoder network based on the training image;
computing a loss of the encoder network based on a comparison of the training image and a reconstructed image, wherein the reconstructed image is generated by the generator network based on the code vector;
updating parameters of the encoder network based on the loss of the encoder network;
providing, to the discriminator network, the training image and a fake image, wherein the fake image is generated by the generator network based on the latent space;
computing a loss of the discriminator network based on predictions of the discriminator network as to whether each of the training image and the fake image is real or fake; and
updating parameters of the discriminator network based on the loss of the discriminator network.
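The training recited in claim 1 can be illustrated with a minimal numerical sketch. This is not the patented implementation: the linear encoder/generator/discriminator, the dimensions, the learning rate `LR`, and the weighting `LAM` between the two generator loss terms are all hypothetical choices made for the sketch. It shows the claimed structure: the generator is shared between the autoencoder (encoder + generator, trained with a reconstruction loss) and the GAN (generator + discriminator, trained with an adversarial loss), and the generator's loss combines both terms.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM_X, DIM_Z = 4, 2                                # image / latent dims (toy)
We = rng.normal(scale=0.1, size=(DIM_Z, DIM_X))    # encoder network E
Wg = rng.normal(scale=0.1, size=(DIM_X, DIM_Z))    # generator network G (shared)
wd = rng.normal(scale=0.1, size=DIM_X)             # discriminator network D
LR, LAM = 0.05, 0.1                                # hypothetical hyperparameters

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_step(x):
    """One combined autoencoder + GAN update on training image x."""
    global We, Wg, wd
    # Autoencoder pass: encode the training image to a code vector z in
    # the latent space, then reconstruct it with the generator.
    z = We @ x
    x_rec = Wg @ z
    err = x_rec - x
    rec_loss = float(err @ err)                    # encoder (reconstruction) loss
    dWg_rec = 2.0 * np.outer(err, z)               # analytic grads (linear layers)
    dWe = 2.0 * np.outer(Wg.T @ err, x)
    # GAN pass: the generator makes a fake image from a latent sample;
    # the discriminator sees the real training image and the fake image.
    z_fake = rng.normal(size=DIM_Z)
    x_fake = Wg @ z_fake
    d_real, d_fake = sigmoid(wd @ x), sigmoid(wd @ x_fake)
    d_loss = -np.log(d_real) - np.log(1.0 - d_fake)   # real/fake cross-entropy
    dwd = -(1.0 - d_real) * x + d_fake * x_fake
    # Generator's adversarial loss term: -log D(G(z_fake)).
    dWg_adv = -(1.0 - d_fake) * np.outer(wd, z_fake)
    # Parameter updates: the generator's loss combines the first
    # (reconstruction) and second (adversarial) loss terms.
    We -= LR * dWe
    Wg -= LR * (dWg_rec + LAM * dWg_adv)
    wd -= LR * dwd
    return rec_loss, float(d_loss)

x = np.array([1.0, -0.5, 0.25, 0.8])               # toy "training image"
rec0, _ = train_step(x)
for _ in range(200):
    rec, d_loss = train_step(x)
```

After a few hundred steps on the toy example the reconstruction loss shrinks, while the discriminator and adversarial terms keep pushing the shared generator toward outputs the discriminator scores as real; in the patented use, such a trained generator produces the augmented training images used to train the gaze prediction model.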