US 12,462,817 B2
Three-dimensional audio signal coding method and apparatus, and encoder
Yuan Gao, Beijing (CN); Shuai Liu, Beijing (CN); Bin Wang, Shenzhen (CN); Zhe Wang, Beijing (CN); Tianshu Qu, Beijing (CN); and Jiahao Xu, Beijing (CN)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Filed by HUAWEI TECHNOLOGIES CO., LTD., Guangdong (CN)
Filed on Nov. 16, 2023, as Appl. No. 18/511,025.
Application 18/511,025 is a continuation of application No. PCT/CN2022/091568, filed on May 7, 2022.
Claims priority of application No. 202110536623.0 (CN), filed on May 17, 2021.
Prior Publication US 2024/0087578 A1, Mar. 14, 2024
Int. Cl. G10L 19/008 (2013.01); G10L 19/16 (2013.01); H04S 7/00 (2006.01)
CPC G10L 19/008 (2013.01) [G10L 19/167 (2013.01); H04S 7/00 (2013.01); H04S 2420/11 (2013.01)]
18 Claims
 
1. A three-dimensional audio signal encoding method performed by at least one processor, coupled with a memory configured to store a computer program, which when executed by the at least one processor, causes the at least one processor to perform the method, comprising:
obtaining a first correlation between a current frame of a three-dimensional audio signal and a representative virtual speaker set for a previous frame, wherein a virtual speaker in the representative virtual speaker set for the previous frame is used for encoding the previous frame of the three-dimensional audio signal, and the first correlation is used to determine whether to reuse the representative virtual speaker set for the previous frame when the current frame is encoded;
obtaining, after the obtaining the first correlation, a second correlation between the current frame and a candidate virtual speaker set, wherein the second correlation is used to determine whether the candidate virtual speaker set is used when the current frame is encoded, and the representative virtual speaker set for the previous frame is a proper subset of the candidate virtual speaker set;
encoding, by an encoder, the current frame based on the representative virtual speaker set for the previous frame when the first correlation satisfies a reuse condition, to obtain a bitstream, wherein the reuse condition comprises: the first correlation being greater than the second correlation; and
decoding, with a decoder communicatively coupled with the encoder, the bitstream generated by the encoder.
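
The reuse decision recited in claim 1 can be illustrated with a minimal sketch. The claim does not say how either correlation is computed, so the measure used here (the mean absolute cosine similarity between one coefficient vector for the frame and the coefficient vector of each virtual speaker) is an assumption made only for illustration. The names set_correlation and should_reuse_previous_set are hypothetical, as is the representation of a frame as a single vector.

```python
import math


def set_correlation(frame, speakers):
    """Assumed measure: mean absolute cosine similarity between the frame
    vector and each speaker vector. The claim does not define the measure."""
    if not speakers:
        raise ValueError("speaker set must not be empty")

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return abs(dot) / norm if norm else 0.0

    return sum(cosine(frame, s) for s in speakers) / len(speakers)


def should_reuse_previous_set(frame, previous_representative, candidates):
    """Return (reuse, first, second) following the order of steps in claim 1."""
    # The claim requires the previous representative set to be a proper
    # subset of the candidate set.
    if not set(previous_representative) < set(candidates):
        raise ValueError("previous set must be a proper subset of the candidates")

    # Step 1: correlation with the representative set of the previous frame.
    first = set_correlation(frame, previous_representative)
    # Step 2: correlation with the candidate set, obtained after the first.
    second = set_correlation(frame, candidates)
    # Reuse condition from the claim: first correlation greater than second.
    return first > second, first, second


# Hypothetical data: three candidate speakers, one of which was the
# representative speaker for the previous frame.
candidates = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
previous_representative = [(1.0, 0.0, 0.0)]

similar_frame = (0.9, 0.1, 0.0)    # close to the previous speaker direction
different_frame = (0.0, 1.0, 0.0)  # orthogonal to the previous speaker

reuse_similar, _, _ = should_reuse_previous_set(
    similar_frame, previous_representative, candidates)
reuse_different, _, _ = should_reuse_previous_set(
    different_frame, previous_representative, candidates)
print(reuse_similar, reuse_different)
```

With this assumed measure, a frame that stays close to the previous speaker direction satisfies the reuse condition, while a frame that has moved away does not. The sketch uses a mean rather than a maximum because a maximum taken over the candidate set can never be smaller than one taken over its proper subset, which would make the reuse condition unreachable.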