US 12,334,055 B2
Stochastic future context for speech processing
Kwangyoun Kim, Santa Clara, CA (US); Felix Wu, Ithaca, NY (US); Prashant Sridhar, New York, NY (US); and Kyu Jeong Han, Pleasanton, CA (US)
Assigned to ASAPP, INC., New York, NY (US)
Filed by ASAPP, INC., New York, NY (US)
Filed on Nov. 18, 2021, as Appl. No. 17/530,139.
Claims priority of provisional application 63/170,172, filed on Apr. 2, 2021.
Prior Publication US 2022/0319501 A1, Oct. 6, 2022
Int. Cl. G10L 15/16 (2006.01); G06F 16/34 (2025.01); G06F 40/30 (2020.01); G06N 3/04 (2023.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01); G10L 15/06 (2013.01); G10L 15/26 (2006.01); H04R 3/00 (2006.01)

CPC G10L 15/16 (2013.01) [G06N 3/045 (2023.01); G10L 15/26 (2013.01)]

20 Claims

1. A computer-implemented method for training a neural network to compute a first output corresponding to a first input, comprising:

obtaining a corpus of training data;

initializing parameters of the neural network; and

training the parameters of the neural network with a plurality of update steps, wherein a first update step comprises:

determining a first future-context size by sampling a probability distribution to choose a value for the first future-context size with a probability defined by a density function of the probability distribution, wherein the first future-context size corresponds to an amount of input that are future to the first input to be used in computing the first output,

masking the neural network using the first future-context size to obtain a first masked neural network,

computing the first output of the neural network by processing a first sample of the training data with the first masked neural network,

computing a first loss value using the first output, and

updating the parameters of the neural network using the first loss value.
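
As an illustration only, the update step recited in claim 1 can be sketched in plain Python. This is not the patented implementation. The "neural network" here is assumed to be a single masked 1-D convolution kernel, the probability distribution is assumed to be a small categorical distribution, and the loss is assumed to be mean squared error. The names `MAX_FUTURE`, `PAST`, `sample_future_context`, `build_mask`, `forward`, and `update_step` are all hypothetical and do not appear in the claim.

```python
import random

MAX_FUTURE = 3  # assumed largest future-context size
PAST = 2        # assumed fixed past-context size
OFFSETS = list(range(-PAST, MAX_FUTURE + 1))  # kernel offsets relative to position t


def sample_future_context(rng, weights=(0.4, 0.3, 0.2, 0.1)):
    """Sample a future-context size in 0..MAX_FUTURE from an assumed categorical distribution."""
    return rng.choices(range(MAX_FUTURE + 1), weights=weights, k=1)[0]


def build_mask(future_size):
    """Return 1.0 for offsets up to future_size and 0.0 for offsets further in the future."""
    return [1.0 if off <= future_size else 0.0 for off in OFFSETS]


def forward(params, mask, xs):
    """Masked convolution: output t only sees inputs t-PAST .. t+future_size."""
    out = []
    for t in range(len(xs)):
        acc = 0.0
        for i, off in enumerate(OFFSETS):
            j = t + off
            if 0 <= j < len(xs):
                acc += params[i] * mask[i] * xs[j]
        out.append(acc)
    return out


def update_step(params, sample, rng, lr=0.05):
    """One update step: sample a context size, mask, run forward, compute loss, update."""
    xs, ys = sample
    n = len(xs)
    future_size = sample_future_context(rng)   # determine the future-context size
    mask = build_mask(future_size)             # mask the network
    preds = forward(params, mask, xs)          # compute the output
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n  # mean squared error
    # Gradient of the loss with respect to each kernel weight;
    # masked weights receive zero gradient and stay unchanged in this step.
    grads = [0.0] * len(params)
    for t in range(n):
        err = 2.0 * (preds[t] - ys[t]) / n
        for i, off in enumerate(OFFSETS):
            j = t + off
            if 0 <= j < n:
                grads[i] += err * mask[i] * xs[j]
    new_params = [p - lr * g for p, g in zip(params, grads)]
    return new_params, loss, future_size
```

Calling `update_step` repeatedly over samples from a corpus, each time with a freshly sampled future-context size, corresponds to the "plurality of update steps" in the claim. Under these assumptions, the same parameters are exposed to several future-context sizes during training.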