CPC G06F 40/30 (2020.01) [G06F 18/217 (2023.01); G06F 18/2148 (2023.01); G06F 40/284 (2020.01); G06F 40/35 (2020.01); G06F 40/56 (2020.01); G06N 3/049 (2013.01); G06N 20/00 (2019.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 2015/0631 (2013.01); G10L 2015/228 (2013.01)]
20 Claims
1. A computer-implemented method, comprising:
initializing a model having a sequence to sequence network architecture comprising an encoder and a decoder;
training the model using a plurality of training sequences, wherein each training sequence comprises an encoder sequence and a decoder sequence, and wherein training the model comprises:
generating an encoding for each training sequence in the plurality of training sequences; and
for each encoding:
randomly inserting an informative padding, comprising a random sampling of encoded tokens from the plurality of training sequences, into the encoder sequence of the encoding; and
training the model using the encoder sequence and the decoder sequence; and
generating, using the trained model, a prediction based on an input data set.
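The padding step recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading of the claim, not the patented implementation. Several details are assumptions that the claim does not specify: tokens are represented as integer IDs, the padding is inserted as a single contiguous block at a uniformly random position, the padding length `pad_len` is fixed, and the token pool is drawn from the encoder sequences only. The function names `insert_informative_padding` and `augment` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[Sequence[int], Sequence[int]]  # (encoder sequence, decoder sequence)


def insert_informative_padding(
    encoder_seq: Sequence[int],
    token_pool: Sequence[int],
    pad_len: int,
    rng: random.Random,
) -> List[int]:
    """Return a copy of encoder_seq with pad_len sampled tokens inserted.

    Assumption: the padding is one contiguous block placed at a random
    position (the claim does not say whether it is contiguous).
    """
    # Sample padding tokens (with replacement) from tokens seen in training.
    padding = [rng.choice(token_pool) for _ in range(pad_len)]
    # randint is inclusive on both ends, so the block may also be
    # prepended (pos == 0) or appended (pos == len(encoder_seq)).
    pos = rng.randint(0, len(encoder_seq))
    return list(encoder_seq[:pos]) + padding + list(encoder_seq[pos:])


def augment(pairs: Sequence[Pair], pad_len: int = 3, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Pad every encoder sequence; decoder sequences are left unchanged."""
    rng = random.Random(seed)
    # Assumption: the pool is every token from every encoder sequence.
    token_pool = [tok for enc, _ in pairs for tok in enc]
    return [
        (insert_informative_padding(enc, token_pool, pad_len, rng), list(dec))
        for enc, dec in pairs
    ]


if __name__ == "__main__":
    training_pairs = [([1, 2, 3], [10, 11]), ([4, 5], [12])]
    for enc, dec in augment(training_pairs, pad_len=2):
        print(enc, dec)
```

The padded encoder sequences and the untouched decoder sequences would then be passed to whatever sequence-to-sequence training loop is in use; that loop and the final prediction step are outside the scope of this sketch.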