CPC G06N 3/088 (2013.01) [G06F 40/20 (2020.01); G06N 3/08 (2013.01); G06N 3/086 (2013.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G06N 3/02 (2013.01); G06N 3/082 (2013.01)] | 18 Claims |
1. A computer-implemented method for disentangled data generation, comprising:
accessing a dataset including a plurality of pairs, each formed from a given one of a plurality of input text structures and a given one of a plurality of style labels for the plurality of input text structures;
training an encoder neural network to disentangle a sequential text input into disentangled representations, including a content embedding and a style embedding, based on a subset of the dataset, using an objective function that includes a regularization term that minimizes mutual information between the content embedding and the style embedding, wherein the objective function is:
$\mathcal{L}_{VAE} + \lambda \mathcal{L}_{reg}$
wherein $\lambda$ is a hyperparameter reweighting a regularization $\mathcal{L}_{reg}$ and a variational autoencoder objective $\mathcal{L}_{VAE}$, where $\mathcal{L}_{reg}$ is expressed as $\mathcal{L}_{Dis} + MI(s; c)$, including a disentanglement loss $\mathcal{L}_{Dis}$ and a mutual information term $MI(s; c)$ based on a style embedding $s$ and a content embedding $c$; and
training a generator neural network to generate a text output that includes content from the style embedding, expressed in a style other than that represented by the style embedding of the text input.
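For illustration only, the way the recited objective combines its terms can be sketched as below. This is a minimal sketch and not the claimed implementation: the function names `regularizer` and `total_objective` are hypothetical, and the three loss values are assumed to have been computed elsewhere as plain scalars (the claim does not specify how the VAE objective, the disentanglement loss, or the mutual information estimate are obtained).

```python
def regularizer(dis_loss: float, mi_estimate: float) -> float:
    """Regularization term: disentanglement loss plus MI(s; c).

    Both inputs are assumed to be precomputed scalars; how they are
    estimated is outside the scope of this sketch.
    """
    return dis_loss + mi_estimate


def total_objective(vae_loss: float, dis_loss: float,
                    mi_estimate: float, lam: float) -> float:
    """Overall objective: VAE objective plus lambda times the regularizer.

    `lam` is the hyperparameter that reweights the regularization
    against the variational autoencoder objective.
    """
    return vae_loss + lam * regularizer(dis_loss, mi_estimate)


if __name__ == "__main__":
    # Hypothetical values chosen only to show how the terms combine.
    print(total_objective(vae_loss=2.0, dis_loss=0.5, mi_estimate=0.25, lam=2.0))
```

Under these assumptions, a larger value of the hyperparameter places more weight on separating the style and content embeddings relative to the variational autoencoder objective.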