US 11,887,008 B2
Contextual text generation for question answering and text summarization with supervised representation disentanglement and mutual information minimization
Renqiang Min, Princeton, NJ (US); Christopher Malon, Fort Lee, NJ (US); and Hans Peter Graf, South Amboy, NJ (US)
Assigned to NEC Corporation, Tokyo (JP)
Filed by NEC Laboratories America, Inc., Princeton, NJ (US)
Filed on Dec. 8, 2020, as Appl. No. 17/114,946.
Claims priority of provisional application 62/945,270, filed on Dec. 9, 2019.
Claims priority of provisional application 62/945,274, filed on Dec. 9, 2019.
Prior Publication US 2021/0174784 A1, Jun. 10, 2021
Int. Cl. G06N 3/08 (2023.01); G06N 3/088 (2023.01); G06F 40/20 (2020.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01); G06N 3/086 (2023.01); G06N 3/02 (2006.01); G06N 3/082 (2023.01)

CPC G06N 3/088 (2013.01) [G06F 40/20 (2020.01); G06N 3/08 (2013.01); G06N 3/086 (2013.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G06N 3/02 (2013.01); G06N 3/082 (2013.01)]

18 Claims

1. A computer-implemented method for disentangled data generation, comprising:

accessing a dataset including a plurality of pairs, each formed from a given one of a plurality of input text structures and a given one of a plurality of style labels for the plurality of input text structures;

training an encoder neural network to disentangle a sequential text input into disentangled representations, including a content embedding and a style embedding, based on a subset of the dataset, using an objective function that includes a regularization term that minimizes mutual information between the content embedding and the style embedding, wherein the objective function is:

$\mathcal{L}_{VAE} + \lambda \mathcal{L}_{reg}$

wherein λ is a hyperparameter reweighting a regularization $\mathcal{L}_{reg}$ and a variational autoencoder objective $\mathcal{L}_{VAE}$, where $\mathcal{L}_{reg}$ is expressed as $\mathcal{L}_{Dis} + MI(s; c)$, including a disentanglement loss $\mathcal{L}_{Dis}$ and a mutual information term MI(s; c) based on a style embedding s and a content embedding c; and

training a generator neural network to generate a text output that includes content from the style embedding, expressed in a style other than that represented by the style embedding of the text input.
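
As an illustrative sketch only, the way the claimed objective combines its terms can be written out in a few lines of Python. The claim does not specify how the variational autoencoder loss, the disentanglement loss, or the mutual information estimate are computed, so the function name `total_objective`, its parameter names, and the numeric values below are assumptions introduced for this example; each term is treated as an already-computed scalar.

```python
def total_objective(vae_loss: float, dis_loss: float, mi_estimate: float, lam: float) -> float:
    """Combine scalar loss terms as L_VAE + lambda * (L_Dis + MI(s; c)).

    All four inputs are hypothetical placeholders: the claim names the terms
    but does not say how they are estimated.
    """
    # Regularization term: disentanglement loss plus the mutual information
    # between the style embedding s and the content embedding c.
    reg_loss = dis_loss + mi_estimate
    # lambda reweights the regularization against the VAE objective.
    return vae_loss + lam * reg_loss


# Example with made-up values: L_VAE = 2.0, L_Dis = 0.5, MI(s; c) = 0.25, lambda = 0.5
example_total = total_objective(2.0, 0.5, 0.25, 0.5)
```

A larger λ places more weight on separating style from content relative to the reconstruction quality captured by the VAE term; setting λ to zero reduces the objective to the VAE loss alone.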