US 12,283,281 B2
Bitrate distribution in immersive voice and audio services
Rishabh Tyagi, Sydney (AU); Juan Felix Torres, Darlinghurst (AU); and Stefanie Brown, Lewisham (AU)
Assigned to Dolby Laboratories Licensing Corporation, San Francisco, CA (US)
Appl. No. 17/772,497
Filed by Dolby Laboratories Licensing Corporation, San Francisco, CA (US)
PCT Filed Oct. 28, 2020, PCT No. PCT/US2020/057737
§ 371(c)(1), (2) Date Apr. 27, 2022,
PCT Pub. No. WO2021/086965, PCT Pub. Date May 6, 2021.
Claims priority of provisional application 63/092,830, filed on Oct. 16, 2020.
Claims priority of provisional application 62/927,772, filed on Oct. 30, 2019.
Prior Publication US 2022/0406318 A1, Dec. 22, 2022
Int. Cl. G10L 19/008 (2013.01); G10L 19/032 (2013.01); G10L 19/16 (2013.01)
CPC G10L 19/032 (2013.01) [G10L 19/008 (2013.01); G10L 19/167 (2013.01)]
10 Claims
 
1. A method of encoding an immersive voice and audio services (IVAS) bitstream, the method comprising:
receiving, using one or more processors, an input audio signal;
downmixing, using the one or more processors, the input audio signal into one or more downmix channels and spatial metadata associated with one or more channels of the input audio signal;
obtaining, using the one or more processors, a set of one or more target bitrates for the one or more downmix channels and a set of metadata quantization levels for the spatial metadata from a bitrate distribution control table;
determining, using the one or more processors, a combination of the one or more target bitrates for the one or more downmix channels;
determining, using the one or more processors, a metadata quantization level from the set of metadata quantization levels using a bitrate distribution process, wherein the bitrate distribution process adjusts at least one of the target bitrates or at least one of the metadata quantization levels of the spatial metadata based at least in part on a bitrate budget for the IVAS bitstream;
quantizing and coding, using the one or more processors, the spatial metadata using the metadata quantization level;
generating, using the one or more processors and the combination of one or more target bitrates, a downmix bitstream for the one or more downmix channels;
combining, using the one or more processors, the downmix bitstream, the quantized and coded spatial metadata and the coded set of metadata quantization levels into the IVAS bitstream; and
outputting, streaming or storing the IVAS bitstream for playback on an IVAS-enabled device.
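
The bitrate distribution step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names TableRow, distribute_bits and md_bits_for_level, the per-channel floor min_bitrates, the use of bits per frame as the common unit, and the two-pass policy (first try coarser metadata, then trim the downmix targets) are all assumptions made for illustration. The claim only requires that at least one target bitrate or at least one metadata quantization level be adjusted based at least in part on the bitrate budget.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class TableRow:
    """One hypothetical row of a bitrate distribution control table."""
    target_bitrates: Tuple[int, ...]  # bits per frame, one entry per downmix channel
    min_bitrates: Tuple[int, ...]     # assumed floor per downmix channel (not in the claim)
    md_quant_levels: Tuple[int, ...]  # metadata quantization levels, finest first


def distribute_bits(
    row: TableRow,
    budget_bits: int,
    md_bits_for_level: Callable[[int], int],
) -> Tuple[List[int], int, int]:
    """Return (per-channel bits, chosen metadata level, metadata bits).

    md_bits_for_level is a hypothetical callback that reports how many bits
    the quantized and coded spatial metadata needs at a given level.
    """
    core_target = sum(row.target_bitrates)

    # Pass 1: pick the finest level that fits next to the unmodified targets.
    for level in row.md_quant_levels:
        md_bits = md_bits_for_level(level)
        if md_bits + core_target <= budget_bits:
            bitrates = list(row.target_bitrates)
            # Assumption: any surplus goes to the first downmix channel.
            bitrates[0] += budget_bits - md_bits - core_target
            return bitrates, level, md_bits

    # Pass 2: keep the coarsest level and trim channel targets, last channel
    # first, never going below the assumed per-channel floor.
    level = row.md_quant_levels[-1]
    md_bits = md_bits_for_level(level)
    shortfall = md_bits + core_target - budget_bits
    bitrates = list(row.target_bitrates)
    for i in reversed(range(len(bitrates))):
        cut = min(shortfall, bitrates[i] - row.min_bitrates[i])
        bitrates[i] -= cut
        shortfall -= cut
    if shortfall > 0:
        raise ValueError("bitrate budget too small for this table row")
    return bitrates, level, md_bits


# Example with made-up numbers: two downmix channels, three metadata levels.
example_row = TableRow(
    target_bitrates=(400, 200),
    min_bitrates=(300, 100),
    md_quant_levels=(0, 1, 2),
)
example_md_cost = {0: 150, 1: 100, 2: 60}
example_result = distribute_bits(example_row, 700, example_md_cost.__getitem__)
```

With a budget of 700 bits per frame, the finest level (150 bits) does not fit next to the 600 bits of channel targets, so the sketch falls back to level 1 (100 bits) and leaves the targets unchanged. In both passes the returned channel bits plus metadata bits add up exactly to the budget.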