US 11,699,079 B2
Systems and methods for time series analysis using attention models
Andreas Spanias, Tempe, AZ (US); Huan Song, Tempe, AZ (US); Jayaraman J. Thiagarajan, Milpitas, CA (US); and Deepta Rajan, Bellevue, WA (US)
Assigned to Arizona Board of Regents On Behalf Of Arizona State University, Scottsdale, AZ (US); and Lawrence Livermore National Security, LLC, Livermore, CA (US)
Filed by Andreas Spanias, Tempe, AZ (US); Huan Song, Tempe, AZ (US); Jayaraman J. Thiagarajan, Milpitas, CA (US); and Deepta Rajan, Bellevue, WA (US)
Filed on Jan. 22, 2020, as Appl. No. 16/748,985.
Claims priority of provisional application 62/795,176, filed on Jan. 22, 2019.
Prior Publication US 2020/0236402 A1, Jul. 23, 2020
Int. Cl. G06N 3/082 (2023.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01); G06F 18/213 (2023.01); G06N 3/048 (2023.01); G06F 18/24 (2023.01)
CPC G06N 3/082 (2013.01) [G06F 18/213 (2023.01); G06N 3/04 (2013.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01); G06F 18/24 (2023.01); G06F 2218/00 (2023.01)] 12 Claims
OG exemplary drawing
 
1. A computer-based method for analyzing and modeling multivariate time series data based on an attention computation, the method comprising:
capturing dependencies across different variables through input embedding;
mapping an order of a sample appearance to a randomized lookup table via positional encoding;
capturing dependencies within a plurality of self-attention mechanisms, each self-attention mechanism capturing dependencies within a single sequence;
determining a range of dependency to consider for each position being analyzed within the single sequence of each self-attention mechanism;
obtaining, within the single sequence of each self-attention mechanism, a plurality of attention weightings to other positions through computation of an inner product;
utilizing the plurality of attention weightings to acquire a plurality of vector representations for a position;
masking the single sequence of each self-attention mechanism to enable causality;
employing a dense interpolation technique for encoding partial temporal ordering to obtain a single vector representation from the plurality of vector representations;
applying a linear layer to obtain logits from the single vector representation; and
applying a final prediction layer whose type depends on the specific task.
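For illustration, the following is a minimal NumPy sketch of the attention steps recited in claim 1: an input embedding that captures dependencies across variables, a positional encoding drawn from a randomized lookup table, and one self-attention head whose inner-product attention weightings are causally masked. The array shapes, the random-projection embedding, and the name causal_self_attention are assumptions made for this sketch, not details taken from the patent.

    import numpy as np

    rng = np.random.default_rng(0)
    T, n_vars, D = 50, 8, 64        # time steps, input variables, model dim

    # Input embedding: capture dependencies across the different
    # variables by projecting each multivariate sample into D dims.
    W_embed = rng.normal(scale=0.1, size=(n_vars, D))
    x = rng.normal(size=(T, n_vars))          # one multivariate time series
    h = x @ W_embed                           # (T, D)

    # Positional encoding: map each sample's order of appearance
    # to a row of a randomized lookup table.
    pos_table = rng.normal(scale=0.1, size=(T, D))
    h = h + pos_table

    def causal_self_attention(h, d_k=64):
        """One self-attention head over a single sequence, causally masked."""
        T, D = h.shape
        Wq = rng.normal(scale=0.1, size=(D, d_k))
        Wk = rng.normal(scale=0.1, size=(D, d_k))
        Wv = rng.normal(scale=0.1, size=(D, d_k))
        Q, K, V = h @ Wq, h @ Wk, h @ Wv
        # Attention weightings to other positions via inner products.
        scores = Q @ K.T / np.sqrt(d_k)               # (T, T)
        scores[np.triu_indices(T, k=1)] = -np.inf     # mask future: causality
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        return w @ V                # one vector representation per position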
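Continuing the sketch above, a dense interpolation step collapses the per-position vectors into a single fixed-length vector that encodes partial temporal ordering, after which a linear layer produces logits and a task-specific prediction layer produces the output. The interpolation factor M, the squared-distance weighting, and the two-class softmax head are assumptions for this sketch.

    def dense_interpolation(u, M=3):
        """Fold per-position vectors u (T, D) into one vector of size M*D,
        encoding partial temporal order via distance-based weights."""
        T, D = u.shape
        out = np.zeros((M, D))
        for t in range(T):
            s = M * (t + 1) / T                  # fractional slot for step t
            for m in range(1, M + 1):
                out[m - 1] += (1 - abs(s - m) / M) ** 2 * u[t]
        return out.reshape(-1)                   # single vector representation

    n_classes = 2
    z = dense_interpolation(causal_self_attention(h))
    W_out = rng.normal(scale=0.1, size=(z.size, n_classes))
    logits = z @ W_out                           # linear layer -> logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                         # task-specific softmax head

For a regression or multi-label task, the final softmax would be replaced by an identity or element-wise sigmoid layer, consistent with the claim's task-dependent prediction layer.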