US 11,887,008 B2
Contextual text generation for question answering and text summarization with supervised representation disentanglement and mutual information minimization
Renqiang Min, Princeton, NJ (US); Christopher Malon, Fort Lee, NJ (US); and Hans Peter Graf, South Amboy, NJ (US)
Assigned to NEC Corporation, Tokyo (JP)
Filed by NEC Laboratories America, Inc., Princeton, NJ (US)
Filed on Dec. 8, 2020, as Appl. No. 17/114,946.
Claims priority of provisional application 62/945,270, filed on Dec. 9, 2019.
Claims priority of provisional application 62/945,274, filed on Dec. 9, 2019.
Prior Publication US 2021/0174784 A1, Jun. 10, 2021
Int. Cl. G06N 3/08 (2023.01); G06N 3/088 (2023.01); G06F 40/20 (2020.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01); G06N 3/086 (2023.01); G06N 3/02 (2006.01); G06N 3/082 (2023.01)
CPC G06N 3/088 (2013.01) [G06F 40/20 (2020.01); G06N 3/08 (2013.01); G06N 3/086 (2013.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G06N 3/02 (2013.01); G06N 3/082 (2013.01)] 18 Claims
 
1. A computer-implemented method for disentangled data generation, comprising:
accessing a dataset including a plurality of pairs, each formed from a given one of a plurality of input text structures and a given one of a plurality of style labels for the plurality of input text structures;
training an encoder neural network to disentangle a sequential text input into disentangled representations, including a content embedding and a style embedding, based on a subset of the dataset, using an objective function that includes a regularization term that minimizes mutual information between the content embedding and the style embedding, wherein the objective function is:
$\mathcal{L}_{\mathrm{VAE}} + \lambda\,\mathcal{L}_{\mathrm{reg}}$
wherein $\lambda$ is a hyperparameter reweighting a regularization $\mathcal{L}_{\mathrm{reg}}$ and a variational autoencoder objective $\mathcal{L}_{\mathrm{VAE}}$, where $\mathcal{L}_{\mathrm{reg}}$ is expressed as $\mathcal{L}_{\mathrm{Dis}} + \mathrm{MI}(s; c)$, including a disentanglement loss $\mathcal{L}_{\mathrm{Dis}}$ and a mutual information term $\mathrm{MI}(s; c)$ based on a style embedding $s$ and a content embedding $c$; and
training a generator neural network to generate a text output that includes content from the style embedding, expressed in a style other than that represented by the style embedding of the text input.
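
As an illustration only, the way the terms of the claimed objective combine can be sketched in a few lines of Python. The sketch assumes the objective has the form L_VAE + λ·(L_Dis + MI(s; c)), which is how the claim's description of λ reads. The function names `total_objective` and `gaussian_mi` are hypothetical and do not come from the patent. The mutual information proxy is a stand-in that assumes scalar, jointly Gaussian embeddings, where MI = -0.5·ln(1 - ρ²). The claim does not specify how MI(s; c) is estimated, so this is not the patented estimator.

```python
import math
from statistics import mean


def gaussian_mi(xs, ys):
    """Stand-in MI estimate (in nats) for two scalar variables.

    Assumption: the variables are jointly Gaussian, so
    MI = -0.5 * ln(1 - rho^2), with rho the sample correlation.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    rho = cov / math.sqrt(var_x * var_y)
    # Clamp so perfectly correlated inputs give a large finite value.
    rho_sq = min(rho * rho, 1.0 - 1e-12)
    return -0.5 * math.log(1.0 - rho_sq)


def total_objective(l_vae, l_dis, mi_sc, lam):
    """Combine the terms as L_VAE + lam * (L_Dis + MI(s; c))."""
    l_reg = l_dis + mi_sc  # regularization: disentanglement loss plus MI term
    return l_vae + lam * l_reg


if __name__ == "__main__":
    # Hypothetical one-dimensional style and content codes for four inputs.
    style = [0.0, 1.0, 0.0, 1.0]
    content = [1.0, 1.0, -1.0, -1.0]
    mi = gaussian_mi(style, content)
    print(total_objective(l_vae=2.0, l_dis=0.5, mi_sc=mi, lam=0.1))
```

In this toy data the style and content codes are uncorrelated, so the stand-in MI term contributes nothing. Correlated codes would raise the regularization term and therefore the total objective being minimized.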