US 11,854,560 B2
Audio scene encoder, audio scene decoder and related methods using hybrid encoder-decoder spatial analysis
Guillaume Fuchs, Erlangen (DE); Stefan Bayer, Erlangen (DE); Markus Multrus, Erlangen (DE); Oliver Thiergart, Erlangen (DE); Alexandre Bouthéon, Erlangen (DE); Jürgen Herre, Erlangen (DE); Florin Ghido, Erlangen (DE); Wolfgang Jaegers, Forchheim (DE); and Fabian Küch, Erlangen (DE)
Assigned to Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Filed by Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V., Munich (DE)
Filed on Dec. 20, 2021, as Appl. No. 17/645,110.
Application 17/645,110 is a continuation of application No. 16/943,065, filed on Jul. 30, 2020, granted, now 11,361,778.
Application 16/943,065 is a continuation of application No. PCT/EP2019/052428, filed on Jan. 31, 2019.
Claims priority of application No. 18154749 (EP), filed on Feb. 1, 2018; and application No. 18185852 (EP), filed on Jul. 26, 2018.
Prior Publication US 2022/0139409 A1, May 5, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 19/032 (2013.01); G10L 19/008 (2013.01); H04R 3/00 (2006.01); H04R 3/04 (2006.01); H04R 3/12 (2006.01); H04R 5/04 (2006.01); H04S 7/00 (2006.01)

CPC G10L 19/032 (2013.01) [G10L 19/008 (2013.01); H04R 3/005 (2013.01); H04R 3/04 (2013.01); H04R 3/12 (2013.01); H04R 5/04 (2013.01); H04S 7/307 (2013.01); H04S 2420/11 (2013.01)]

16 Claims

1. Audio scene encoder for encoding an audio scene, the audio scene comprising at least two component signals, the audio scene encoder comprising:

a core encoder for core encoding the at least two component signals, wherein the core encoder is configured to generate a first encoded representation for a first portion of the at least two component signals, and to generate a second encoded representation for a second portion of the at least two component signals;

a spatial analyzer for analyzing the audio scene comprising the at least two component signals to derive one or more spatial parameters or one or more spatial parameter sets for the second portion of the at least two component signals; and

an output interface for forming an encoded audio scene signal, the encoded audio scene signal comprising the first encoded representation for the first portion of the at least two component signals, the second encoded representation for the second portion of the at least two component signals, and the one or more spatial parameters or the one or more spatial parameter sets for the second portion of the at least two component signals,

wherein the core encoder is configured to generate the first encoded representation with a first frequency resolution and to generate the second encoded representation with a second frequency resolution, the second frequency resolution being lower than the first frequency resolution, from subsequent time frames from the at least two component signals, wherein a first time frame of the subsequent time frames is the first portion of the at least two component signals and a second time frame of the subsequent time frames is the second portion of the at least two component signals, or

wherein a border frequency between a first frequency subband of a time frame and a second frequency subband of the time frame coincides with a border between a scale factor band and an adjacent scale factor band or does not coincide with a border between the scale factor band and the adjacent scale factor band, wherein the scale factor band and the adjacent scale factor band are used by the core encoder, wherein the first frequency subband of the time frame is the first portion of the at least two component signals and the second frequency subband of the time frame is the second portion of the at least two component signals.
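
The data flow recited in claim 1 can be illustrated with a minimal sketch. It covers only the second alternative of the claim, in which one time frame is split at a border frequency into a first and a second frequency subband. Everything named here is a hypothetical illustration and not taken from the patent: the toy uniform quantizer standing in for the core encoder, the per-component energy share standing in for a spatial parameter, the `SFB_BORDERS` table, and all function and field names.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical scale factor band borders, given as spectral bin indices.
SFB_BORDERS = [0, 4, 8]


@dataclass
class EncodedScene:
    """Toy stand-in for the encoded audio scene signal."""
    first: List[List[int]]        # first encoded representation, per component
    second: List[List[int]]       # second encoded representation, per component
    spatial_params: List[float]   # derived for the second portion only
    border_on_sfb: bool           # does the border coincide with an SFB border?


def quantize(bins: Sequence[float], step: float) -> List[int]:
    """Toy uniform quantizer used in place of a real core encoder."""
    return [round(x / step) for x in bins]


def energy_shares(components: Sequence[Sequence[float]]) -> List[float]:
    """Assumed spatial parameter: each component's share of the total energy."""
    energies = [sum(x * x for x in c) for c in components]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(components)
    return [e / total for e in energies]


def encode_frame(spectra: Sequence[Sequence[float]],
                 border_bin: int,
                 step: float = 1.0) -> EncodedScene:
    """Split one frame at border_bin and encode both portions."""
    if len(spectra) < 2:
        raise ValueError("an audio scene needs at least two component signals")
    low = [list(s[:border_bin]) for s in spectra]    # first portion
    high = [list(s[border_bin:]) for s in spectra]   # second portion
    return EncodedScene(
        first=[quantize(b, step) for b in low],
        second=[quantize(b, step) for b in high],
        spatial_params=energy_shares(high),  # spatial analysis of second portion
        border_on_sfb=border_bin in SFB_BORDERS,
    )


if __name__ == "__main__":
    frame = [
        [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 3.0, 0.0],
        [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
    ]
    print(encode_frame(frame, border_bin=4))
```

In this sketch both portions are core encoded for every component signal, while spatial parameters accompany only the second portion, mirroring the structure of the output interface in the claim. The `border_on_sfb` flag merely reports which of the two border cases named in the claim applies; the claim permits either.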