US 11,816,442 B2
Multi-turn dialogue response generation with autoregressive transformer models
Oluwatobi Olabiyi, Arlington, VA (US); Erik T. Mueller, Chevy Chase, MD (US); and Rui Zhang, McLean, VA (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Mar. 1, 2023, as Appl. No. 18/115,864.
Application 18/115,864 is a continuation of application No. 16/935,584, filed on Jul. 22, 2020, now U.S. Pat. No. 11,615,255.
Claims priority of provisional application 62/877,076, filed on Jul. 22, 2019.
Prior Publication US 2023/0206005 A1, Jun. 29, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 40/30 (2020.01); G06F 40/56 (2020.01); G06F 40/35 (2020.01); G06N 3/049 (2023.01); G10L 15/22 (2006.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G06N 20/00 (2019.01); G06F 40/284 (2020.01); G06F 18/21 (2023.01); G06F 18/214 (2023.01)

CPC G06F 40/30 (2020.01) [G06F 18/217 (2023.01); G06F 18/2148 (2023.01); G06F 40/284 (2020.01); G06F 40/35 (2020.01); G06F 40/56 (2020.01); G06N 3/049 (2013.01); G06N 20/00 (2019.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 2015/0631 (2013.01); G10L 2015/228 (2013.01)]

20 Claims

1. A computer-implemented method, comprising:

initializing a model having a sequence to sequence network architecture comprising an encoder and a decoder;

training the model using a plurality of training sequences, wherein each training sequence comprises an encoder sequence and a decoder sequence, and wherein training the model comprises:

generating an encoding for each training sequence in the plurality of training sequences; and

for each encoding:

randomly inserting an informative padding, comprising a random sampling of encoded tokens from the plurality of training sequences, into the encoder sequence of the encoding; and

training the model using the encoder sequence and the decoder sequence; and

generating, using the trained model, a prediction based on an input data set.
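
The padding step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`build_token_pool`, `insert_informative_padding`, `padded_training_pairs`), the `target_len` parameter, and the choice to insert the sampled tokens as one contiguous block at a uniformly random position are all assumptions made for illustration. Encoded tokens are represented as plain integers, and model initialization, training, and prediction are omitted.

```python
import random


def build_token_pool(training_sequences):
    """Collect every encoded token from all training sequences.

    Assumption: the pool draws from both the encoder and decoder sides.
    """
    pool = []
    for encoder_seq, decoder_seq in training_sequences:
        pool.extend(encoder_seq)
        pool.extend(decoder_seq)
    return pool


def insert_informative_padding(encoder_seq, token_pool, target_len, rng):
    """Return a copy of encoder_seq grown to target_len.

    The added tokens are sampled at random from token_pool rather than being
    a fixed pad symbol, and are inserted at a random position.
    """
    n_pad = max(0, target_len - len(encoder_seq))
    padding = [rng.choice(token_pool) for _ in range(n_pad)]
    # randint is inclusive on both ends, so the padding may also land
    # before the first token or after the last one.
    position = rng.randint(0, len(encoder_seq))
    return encoder_seq[:position] + padding + encoder_seq[position:]


def padded_training_pairs(training_sequences, target_len, seed=0):
    """Yield (padded encoder sequence, unchanged decoder sequence) pairs."""
    rng = random.Random(seed)
    pool = build_token_pool(training_sequences)
    for encoder_seq, decoder_seq in training_sequences:
        padded = insert_informative_padding(encoder_seq, pool, target_len, rng)
        yield padded, decoder_seq
```

In this sketch each pair yielded by `padded_training_pairs` would then be passed to whatever training step the sequence to sequence model uses; only the encoder side is modified, and the decoder sequence is passed through unchanged.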