| CPC G06F 3/165 (2013.01) [G10L 15/22 (2013.01)] | 10 Claims |

|
1. An audio playback method, comprising:
obtaining original audio data to be played in real time; wherein the original audio data is a segment of an audio object currently being played on an original audio playback device;
performing floating-point sample sampling on the original audio data to obtain a plurality of pieces of floating-point sample audio data;
performing, by using a single instruction multiple data (SIMD) algorithm, transformation domain analysis in a parallel manner on the plurality of pieces of floating-point sample audio data according to a preset parallel quantity, to obtain a plurality of audio characteristic parameters; wherein each audio characteristic parameter is a Mel-Frequency Cepstral Coefficient (MFCC);
performing, by using an Algebraic Codebook Excited Linear Prediction (ACELP) encoding algorithm, quantization and coding on, and removing redundant information from, the plurality of audio characteristic parameters, to obtain a plurality of pieces of target encoded audio data;
performing encapsulation on the plurality of pieces of target encoded audio data to obtain a plurality of target audio frames, and synchronizing the plurality of target audio frames to all communication-capable audio playback devices; wherein each target audio frame comprises a sample rate, a bit depth and a quantity of channels;
in response to a switching command for switching an audio playback device, determining at least one candidate audio playback device indicated by the switching command; wherein the switching command is used to indicate an audio switching position;
performing optimal playback device recognition on the at least one candidate audio playback device based on a signal-to-noise ratio (SNR), an input sensitivity, a conversion rate, channel crosstalk, a common-mode rejection ratio (CMRR), a damping factor, volume, a silent mode switch, a switching priority and a sound effect mode of each candidate audio playback device, and an ambient noise level of an environment where each candidate audio playback device is located, whether a talker is located in a specific space where any candidate audio playback device is located, and whether the environment where the candidate audio playback device is located is in a silent mode, to obtain a target audio playback device;
sending an audio switching playback command to the target audio playback device, to enable the target audio playback device to start audio playback based on the target encoded audio data from the audio switching position;
receiving playback information fed back by the target audio playback device in real time, and analyzing and monitoring a playback state of the target audio playback device in real time according to the playback information, to obtain a monitoring result; wherein the playback state comprises playback progress, a buffering amount, and a CPU load of the target audio playback device; and
determining, based on the monitoring result, whether to adjust an audio frame synchronization strategy for the target audio playback device, and a specific audio frame synchronization strategy to be adjusted.
|
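
The claim names the factors used for optimal playback device recognition and the playback state used for monitoring, but it does not state how they are combined. The following is a minimal sketch of one possible reading, not the claimed implementation. It uses only a subset of the listed factors (SNR, ambient noise level, switching priority, silent mode, talker presence). The `Candidate` and `PlaybackInfo` types, the score weights, the rule that a silent-mode device is excluded, the thresholds, and the strategy names `"send_ahead"`, `"reduce_frame_rate"` and `"keep"` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical record for one candidate audio playback device."""
    name: str
    snr_db: float             # signal-to-noise ratio of the device
    ambient_noise_db: float   # ambient noise level of its environment
    switching_priority: int   # assumption: larger value means preferred
    silent_mode: bool         # device switch or environment is in silent mode
    talker_present: bool      # a talker is located in the device's space


def score(c: Candidate) -> float:
    # Assumed weighting: the claim lists the factors but gives no formula.
    s = c.snr_db - c.ambient_noise_db + 10.0 * c.switching_priority
    if c.talker_present:
        s += 20.0
    return s


def select_target(candidates: List[Candidate]) -> Optional[Candidate]:
    # Assumption: silent-mode devices are excluded rather than penalized.
    eligible = [c for c in candidates if not c.silent_mode]
    if not eligible:
        return None
    return max(eligible, key=score)


@dataclass
class PlaybackInfo:
    """Hypothetical playback state fed back by the target device."""
    progress_ms: float   # playback progress
    buffered_ms: float   # buffering amount
    cpu_load: float      # CPU load as a fraction in [0, 1]


def choose_sync_strategy(info: PlaybackInfo,
                         low_buffer_ms: float = 200.0,
                         high_cpu: float = 0.85) -> str:
    # Assumed thresholds and strategy names, for illustration only.
    if info.cpu_load >= high_cpu:
        return "reduce_frame_rate"   # send frames less often
    if info.buffered_ms < low_buffer_ms:
        return "send_ahead"          # push frames earlier to refill the buffer
    return "keep"                    # leave the current strategy unchanged
```

In this sketch, `select_target` would be called once per switching command, and `choose_sync_strategy` would be called each time playback information arrives from the target device.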