US 11,990,150 B2
Method and device for audio repair and readable storage medium
Dong Xu, Guangdong (CN)
Assigned to Tencent Music Entertainment Technology (Shenzhen) Co., Ltd., Guangdong (CN)
Appl. No. 17/627,103
Filed by Tencent Music Entertainment Technology (Shenzhen) Co., Ltd., Guangdong (CN)
PCT Filed Jun. 28, 2019, PCT No. PCT/CN2019/093719
§ 371(c)(1), (2) Date Jan. 13, 2022,
PCT Pub. No. WO2020/228107, PCT Pub. Date Nov. 19, 2020.
Claims priority of application No. 201910397254.4 (CN), filed on May 13, 2019.
Prior Publication US 2022/0254365 A1, Aug. 11, 2022
Int. Cl. G10L 21/0264 (2013.01); G10L 21/0208 (2013.01); G10L 21/0216 (2013.01); G10L 21/0224 (2013.01); G10L 21/0232 (2013.01)
CPC G10L 21/0264 (2013.01) [G10L 21/0208 (2013.01); G10L 21/0216 (2013.01); G10L 21/0224 (2013.01); G10L 21/0232 (2013.01); G10L 2021/02087 (2013.01); G10L 2021/02163 (2013.01)]
14 Claims
 
1. A method for audio repair, comprising:
inputting sequentially a plurality of audio frames into a cache module, the cache module being sequentially composed of a plurality of processing units, a processing unit located at a center of the plurality of processing units being a center processing unit;
assigning at least one audio frame contained in the center processing unit as a target frame;
detecting a noise point presented as a short-term high-energy pulse in the target frame according to audio characteristics of the plurality of audio frames in the cache module, wherein the detecting comprises:
determining a peak point of the target frame;
obtaining, from the cache module, an audio signal segment of a preset length centered on the peak point;
dividing the audio signal segment into a plurality of sections, wherein the plurality of sections comprise a first processing section, a second processing section, and a middle processing section between the first processing section and the second processing section, and the middle processing section comprises a first sub-section, a second sub-section, and a center sub-section between the first sub-section and the second sub-section;
extracting audio characteristics of the target frame and the plurality of sections respectively, wherein the audio characteristics comprise at least one of a peak value, signal energy, average power, a proportion of local peak, a roll-off rate of an autocorrelation coefficient, a sound intensity, or a peak duration; and
determining the noise point in the target frame according to the audio characteristics of the target frame and the plurality of sections, wherein the determining the noise point in the target frame comprises:
determining whether an amplitude value at the peak point of the target frame is greater than an amplitude value at a peak point of the center sub-section and an amplitude value at a peak point of the middle processing section;
determining whether the amplitude value at the peak point of the target frame is greater than an amplitude value at a peak point of the first sub-section and an amplitude value at a peak point of the second sub-section and a greater portion exceeds a first threshold;
determining whether signal energy of the middle processing section is greater than a second threshold;
determining whether a ratio of average power of the middle processing section to average power of the audio signal segment is greater than a third threshold;
determining whether a ratio of the amplitude value of the peak point of the target frame to a sum of amplitude values at peak points of the audio signal segment is greater than a fourth threshold;
determining whether the roll-off rate of the autocorrelation coefficient of the audio signal segment is greater than a fifth threshold;
determining whether a sound intensity of the middle processing section is greater than a sound intensity of the first processing section and a sound intensity of the second processing section;
determining whether a peak duration of the target frame is shorter than a sixth threshold; and
determining the peak point of the target frame as a noise point in the target frame if determination results are all positive; and
repairing the target frame to remove the noise point in the target frame.
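
Illustrative sketch (not part of the claim): the Python below shows one way the cache module, the section division, and a subset of the claimed determinations could fit together. Everything concrete in it is an assumption made for illustration: the names FrameCache, split_segment, and looks_like_click, the five-unit cache, the 16-sample segment, the quarter/third section proportions, and the threshold values are hypothetical, since the claim does not fix any of them. Only three of the eight determinations are shown (peak excess over the side sub-sections, middle-section energy, and the average-power ratio), and "a greater portion exceeds a first threshold" is read here as the amount by which the peak exceeds the larger side sub-section peak.

```python
from collections import deque


class FrameCache:
    """FIFO of processing units; the middle unit holds the target frame."""

    def __init__(self, num_units=5):  # unit count is an assumed value
        if num_units % 2 == 0:
            raise ValueError("num_units must be odd so a center unit exists")
        self.units = deque(maxlen=num_units)

    def push(self, frame):
        """Append one frame; return True once every unit is occupied."""
        self.units.append(list(frame))
        return len(self.units) == self.units.maxlen

    def center_index(self):
        return self.units.maxlen // 2

    def target_frame(self):
        """Frame in the center unit (only meaningful when the cache is full)."""
        return self.units[self.center_index()]

    def samples(self):
        """All cached samples, oldest first, as one flat list."""
        return [s for unit in self.units for s in unit]

    def target_offset(self):
        """Index in samples() of the first sample of the target frame."""
        return sum(len(u) for u in list(self.units)[: self.center_index()])


def peak_index(x):
    """Index of the sample with the largest absolute amplitude."""
    return max(range(len(x)), key=lambda i: abs(x[i]))


def peak_value(x):
    return max((abs(v) for v in x), default=0.0)


def energy(x):
    return sum(v * v for v in x)


def average_power(x):
    return energy(x) / len(x) if x else 0.0


def split_segment(signal, center, length=16):
    """Cut `length` samples around `center` and divide them into sections.

    Assumed proportions: outer quarters are the first/second processing
    sections; the middle half is split into thirds for the sub-sections.
    """
    start = max(0, min(center - length // 2, len(signal) - length))
    seg = signal[start:start + length]
    q = len(seg) // 4
    middle = seg[q:len(seg) - q]
    t = len(middle) // 3
    return {
        "segment": seg,
        "first": seg[:q],
        "middle": middle,
        "second": seg[len(seg) - q:],
        "first_sub": middle[:t],
        "center_sub": middle[t:len(middle) - t],
        "second_sub": middle[len(middle) - t:],
    }


def looks_like_click(signal, center, t_excess=0.5, t_energy=0.5, t_power=1.5):
    """Apply three of the claimed determinations; thresholds are made up."""
    s = split_segment(signal, center)
    peak = abs(signal[center])
    side_peak = max(peak_value(s["first_sub"]), peak_value(s["second_sub"]))
    seg_power = average_power(s["segment"])
    checks = [
        peak - side_peak > t_excess,              # excess over side sub-sections
        energy(s["middle"]) > t_energy,           # middle-section signal energy
        seg_power > 0
        and average_power(s["middle"]) / seg_power > t_power,  # power ratio
    ]
    return all(checks)  # a noise point requires every result to be positive
```

In use, frames would be pushed one at a time; once push returns True, the global peak position is target_offset() plus peak_index(target_frame()), and that position is passed to looks_like_click together with samples(). A fuller version would add the remaining determinations (local peak proportion, autocorrelation roll-off rate, sound intensity comparison, and peak duration) to the same list of checks.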