US 12,444,398 B1
Manifold learning for sound field estimation
Karim Helwani, San Mateo, CA (US); Michael Mark Goodwin, Scotts Valley, CA (US); and Paris Smaragdis, Urbana, IL (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Sep. 27, 2023, as Appl. No. 18/476,197.
Int. Cl. G10K 11/178 (2006.01)
CPC G10K 11/17823 (2018.01) [G10K 11/17873 (2018.01); G10K 2210/12 (2013.01); G10K 2210/3027 (2013.01); G10K 2210/3028 (2013.01); G10K 2210/3035 (2013.01); G10K 2210/3038 (2013.01); G10K 2210/505 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A computer-implemented method for estimating a sound field for virtual reality or augmented reality, comprising:
receiving, for a first position associated with a near end room, room data comprising (i) input audio data and (ii) target audio data;
generating measurement vector data from the input audio data;
generating initial input vector data for a second position associated with the near end room;
generating input data from (i) the measurement vector data, (ii) the first position, (iii) the initial input vector data, and (iv) the second position;
applying initial filter parameters to the input data that results in filtered data;
generating target data from the target audio data;
determining an estimated loss from the filtered data and the target data;
determining a matrix from a decoder of a trained variational autoencoder;
combining the matrix, the estimated loss, and a step value that results in a tangent vector;
applying the decoder to a point in a tangent space indicated by the tangent vector, wherein the decoder outputs updated filter parameters;
receiving far end audio data;
in response to receiving the far end audio data, substantially in real-time:
generating near end audio data from (i) the far end audio data, (ii) the updated filter parameters, and (iii) the second position; and
outputting the near end audio data.
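
Claim 1 recites an iterative, manifold-constrained update of the filter parameters driven by a trained variational autoencoder. The sketch below is one plausible reading of that update step, not the patented implementation: it assumes a trained VAE decoder that maps a latent point to a vector of filter parameters, treats the "matrix determined from the decoder" as the decoder's Jacobian at the current latent point, and combines that matrix with the gradient of the estimated loss and a step value to form the tangent vector. All identifiers here (manifold_filter_update, decoder, step_value, the linear filter, and the mean-squared loss) are hypothetical and chosen only for illustration.

import torch
from torch.autograd.functional import jacobian

def manifold_filter_update(decoder, z, input_data, target_data, step_value=1e-2):
    # Filter parameters currently produced by the trained VAE decoder
    # (detached so the loss gradient is taken with respect to the parameters).
    theta = decoder(z).detach().requires_grad_(True)

    # Apply the filter parameters to the input data (a linear filter is assumed
    # here purely for illustration) and estimate the loss against the target data.
    filtered = input_data @ theta
    estimated_loss = torch.mean((filtered - target_data) ** 2)

    # Gradient of the estimated loss with respect to the filter parameters.
    grad_theta = torch.autograd.grad(estimated_loss, theta)[0]

    # Matrix determined from the decoder: here, its Jacobian at the latent point z,
    # with shape (num_filter_params, latent_dim).
    J = jacobian(decoder, z)

    # Combine the matrix, the estimated loss (via its gradient), and the step value
    # into a tangent vector in the latent/tangent space.
    tangent_vector = -step_value * (J.T @ grad_theta)

    # Apply the decoder to the point indicated by the tangent vector,
    # yielding updated filter parameters.
    z_updated = z + tangent_vector
    updated_filter_params = decoder(z_updated)
    return z_updated, updated_filter_params

Decoding the stepped latent point, rather than updating the filter parameters directly, keeps the updated parameters on the manifold learned by the autoencoder, which appears to be the motivation for performing the update in the tangent space.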