| CPC G10L 19/008 (2013.01) | 14 Claims |

|
1. A method of audio encoding performed by an audio encoder, comprising:
selecting a first target virtual speaker from a preset virtual speaker set based on a first scene audio signal;
generating a first virtual speaker signal based on the first scene audio signal and attribute information of the first target virtual speaker;
obtaining a second scene audio signal using the attribute information of the first target virtual speaker and the first virtual speaker signal;
generating a residual signal based on the first scene audio signal and the second scene audio signal; and
encoding the first virtual speaker signal and the residual signal, to produce encoded signals, and writing the encoded signals into a bitstream;
wherein
the first scene audio signal comprises a higher order ambisonics (HOA) signal to be encoded, and the attribute information of the first target virtual speaker comprises location information of the first target virtual speaker; and
generating the first virtual speaker signal comprises:
obtaining an HOA coefficient for the first target virtual speaker based on the location information of the first target virtual speaker; and
performing linear combination on the HOA signal to be encoded and the HOA coefficient for the first target virtual speaker to obtain the first virtual speaker signal.
|
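
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and it rests on several assumptions that do not come from the claim: a first-order signal (four channels in ACN order with SN3D normalization) stands in for a general HOA signal, the target speaker is chosen by a hypothetical maximum-energy criterion, the linear combination is normalized as a least-squares projection, and the final encoding and bitstream-writing step is omitted. All function names are hypothetical.

```python
import math


def hoa_coefficients(azimuth, elevation):
    """First-order coefficients (ACN order W, Y, Z, X; SN3D) for a
    speaker location given in radians. Assumed convention."""
    return [
        1.0,
        math.sin(azimuth) * math.cos(elevation),
        math.sin(elevation),
        math.cos(azimuth) * math.cos(elevation),
    ]


def virtual_speaker_signal(scene, coeffs):
    """Linear combination of the HOA channels with the speaker's
    coefficients, normalized as a least-squares projection (assumption)."""
    norm = sum(c * c for c in coeffs)
    n_samples = len(scene[0])
    return [
        sum(c * channel[t] for c, channel in zip(coeffs, scene)) / norm
        for t in range(n_samples)
    ]


def select_target_speaker(scene, speaker_set):
    """Pick the candidate location whose projected signal carries the most
    energy. The claim does not state a criterion; this one is assumed."""
    def energy(location):
        signal = virtual_speaker_signal(scene, hoa_coefficients(*location))
        return sum(v * v for v in signal)
    return max(speaker_set, key=energy)


def encode_frame(scene, speaker_set):
    """Return (speaker location, virtual speaker signal, residual signal)."""
    location = select_target_speaker(scene, speaker_set)
    coeffs = hoa_coefficients(*location)
    speaker_sig = virtual_speaker_signal(scene, coeffs)
    # Second scene audio signal: the scene rebuilt from the speaker signal.
    second_scene = [[c * v for v in speaker_sig] for c in coeffs]
    # Residual: what the single virtual speaker fails to represent.
    residual = [
        [a - b for a, b in zip(channel, rebuilt)]
        for channel, rebuilt in zip(scene, second_scene)
    ]
    return location, speaker_sig, residual
```

In this sketch, a scene that contains a single plane wave arriving exactly from one of the candidate locations is represented almost entirely by the virtual speaker signal, so the residual is close to zero. Any other scene content ends up in the residual, which is why the claim encodes both signals.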