US 12,080,310 B2
Audio encoder and decoder using a frequency domain processor with full-band gap filling and a time domain processor
Sascha Disch, Fürth (DE); Martin Dietz, Nuremberg (DE); Markus Multrus, Nuremberg (DE); Guillaume Fuchs, Bubenreuth (DE); Emmanuel Ravelli, Erlangen (DE); Matthias Neusinger, Rohr (DE); Markus Schnell, Nuremberg (DE); Benjamin Schubert, Nuremberg (DE); and Bernhard Grill, Rückersdorf (DE)
Assigned to Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. (DE)
Filed by Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V., Munich (DE)
Filed on Jun. 1, 2021, as Appl. No. 17/336,132.
Application 17/336,132 is a continuation of application No. 16/286,397, filed on Feb. 26, 2019, granted, now 11,049,508.
Application 16/286,397 is a continuation of application No. 15/414,427, filed on Jan. 24, 2017, granted, now 10,332,535, issued on Jun. 25, 2019.
Application 15/414,427 is a continuation of application No. PCT/EP2015/067003, filed on Jul. 24, 2015.
Claims priority of application No. 14178817 (EP), filed on Jul. 28, 2014.
Prior Publication US 2021/0287689 A1, Sep. 16, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 19/18 (2013.01); G10L 19/02 (2013.01); G10L 19/028 (2013.01); G10L 19/032 (2013.01); G10L 19/04 (2013.01); G10L 19/06 (2013.01); G10L 19/20 (2013.01); G10L 19/24 (2013.01); G10L 19/26 (2013.01); G10L 21/038 (2013.01)

CPC G10L 19/18 (2013.01) [G10L 19/028 (2013.01); G10L 19/032 (2013.01); G10L 19/06 (2013.01); G10L 19/265 (2013.01); G10L 19/02 (2013.01); G10L 19/04 (2013.01); G10L 19/20 (2013.01); G10L 19/24 (2013.01); G10L 21/038 (2013.01)]

18 Claims

1. An audio encoder for encoding an audio signal, the audio signal comprising a first audio signal portion and a timely subsequent second audio signal portion having an audio sampling rate, to generate an encoded audio signal, comprising:

a first encoding processor for encoding the first audio signal portion in a frequency domain to obtain a first encoded signal portion;

a second encoding processor for encoding the second audio signal portion in a time domain to obtain a second encoded signal portion, the second audio signal portion comprising a low band and a high band, wherein the second encoding processor comprises:

a sampling rate converter for converting the second audio signal portion to a lower sampling rate representation of the second audio signal portion, wherein the sampling rate converter is configured so that a lower sampling rate of the lower sampling rate representation is lower than the audio sampling rate of the second audio signal portion, and so that the lower sampling rate representation of the second audio signal portion comprises the low band of the second audio signal portion and does not comprise the high band of the second audio signal portion;

a time domain low band encoder for time domain encoding the lower sampling rate representation of the second audio signal portion; and

a time domain bandwidth extension encoder for parametrically encoding the high band of the second audio signal portion;

a controller configured for analyzing a portion of the audio signal and for determining that the portion of the audio signal is either the first audio signal portion encoded in the frequency domain or the second audio signal portion encoded in the time domain; and

an encoded signal former for forming the encoded audio signal comprising the first encoded signal portion for the first audio signal portion and the second encoded signal portion for the second audio signal portion,

wherein the audio encoder comprises a cross-processor for calculating, from an encoded spectral representation of the first audio signal portion, initialization data of the second encoding processor, so that the second encoding processor is initialized to encode the second audio signal portion immediately following the first audio signal portion in time in the audio signal, wherein the cross-processor comprises:

a spectral decoder for calculating a decoded version of the first encoded signal portion;

a delay stage for feeding a delayed version of the decoded version into a de-emphasis stage of the second encoding processor for initialization;

a weighted prediction coefficient analysis filtering block for filtering and feeding a filter output into a codebook determinator of the second encoding processor for initialization;

an analysis filtering stage for filtering the decoded version or a pre-emphasized version and for feeding a filter residual into an adaptive codebook determinator of the second encoding processor for initialization; or

a pre-emphasis filter for filtering the decoded version and for feeding a delayed or pre-emphasized version to a synthesis filtering stage of the second encoding processor for initialization,

wherein the first encoding processor comprises:

a time frequency converter for converting the first audio signal portion into a frequency domain representation;

an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second spectral portions to be encoded with a second spectral resolution, the second spectral resolution being lower than the first spectral resolution, wherein the analyzer is configured to determine a first spectral portion from the first spectral portions, the first spectral portion being placed, with respect to frequency, between two second spectral portions from the second spectral portions; and

a spectral encoder for encoding the first spectral portions with the first spectral resolution and for encoding the second spectral portions with the second spectral resolution, wherein the spectral encoder comprises a parametric coder for calculating spectral envelope information comprising the second spectral resolution from the second spectral portions.
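
The two-resolution split recited for the first encoding processor can be illustrated with a minimal sketch. It is not the patented implementation: the magnitude threshold used to pick first spectral portions, the band layout in `band_edges`, and the function names `analyze_spectrum`, `encode_spectrum`, and `first_between_second` are all hypothetical choices made here for illustration. Lines labelled as first spectral portions are kept individually (first spectral resolution), while the remaining lines are reduced to one energy value per band (second, lower spectral resolution).

```python
def analyze_spectrum(spectrum, threshold):
    """Label each spectral line.

    True  = first spectral portion (kept line by line).
    False = second spectral portion (described only by a band envelope).
    The magnitude threshold is an assumed criterion, not taken from the claim.
    """
    return [abs(x) >= threshold for x in spectrum]


def encode_spectrum(spectrum, band_edges, threshold):
    """Return (lines, envelope, is_first) for one frequency domain frame."""
    is_first = analyze_spectrum(spectrum, threshold)

    # First spectral resolution: keep each selected line with its index.
    lines = [(k, x) for k, x in enumerate(spectrum) if is_first[k]]

    # Second spectral resolution: one energy value per band, computed
    # only from the lines that were not kept individually.
    envelope = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        energy = sum(spectrum[k] ** 2 for k in range(lo, hi) if not is_first[k])
        envelope.append(energy)

    return lines, envelope, is_first


def first_between_second(is_first):
    """True if some first-portion line has second-portion lines on both
    sides in frequency, which is the layout the analyzer is required to find."""
    return any(
        is_first[k] and (False in is_first[:k]) and (False in is_first[k + 1:])
        for k in range(len(is_first))
    )


# Hypothetical 8-line frame with two strong lines and two bands of 4 lines each.
frame = [0.1, 0.2, 5.0, 0.1, 0.3, 0.2, 4.0, 0.1]
lines, envelope, is_first = encode_spectrum(frame, band_edges=[0, 4, 8], threshold=1.0)
```

In this toy frame the two strong lines at indices 2 and 6 are kept individually, each surrounded by weaker lines that survive only as band energies, so `first_between_second(is_first)` holds.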