US 12,315,526 B2
Method and apparatus for determining echo, and storage medium
Nan Xu, Beijing (CN); Saisai Zou, Beijing (CN); and Li Chen, Beijing (CN)
Assigned to BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Filed by BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Filed on Dec. 2, 2022, as Appl. No. 18/061,151.
Claims priority of application No. 202111480836.2 (CN), filed on Dec. 6, 2021.
Prior Publication US 2023/0096150 A1, Mar. 30, 2023
Int. Cl. G10L 21/00 (2013.01); G10L 21/0208 (2013.01)
CPC G10L 21/0208 (2013.01) [G10L 2021/02082 (2013.01)]
18 Claims
 
1. A method for determining an echo, comprising:
obtaining an echo estimation result by performing echo estimation on an original audio signal;
obtaining an optimization processing result by performing optimization processing on the echo estimation result, wherein the optimization processing comprises at least one of amplitude dimension optimization processing, phase dimension optimization processing, or time domain dimension optimization processing; and
determining an echo of the original audio signal using the optimization processing result;
wherein performing the optimization processing on the echo estimation result comprises:
obtaining an echo extraction result by performing echo extraction on the original audio signal using the echo estimation result;
performing signal processing on the echo extraction result to convert the echo extraction result to a time domain waveform; and
obtaining a fourth adjustment value by inputting the time domain waveform into a pre-trained time domain optimization model; wherein the fourth adjustment value is configured to adjust the echo estimation result in a time domain dimension;
wherein the time domain optimization model is obtained by training based on time domain waveforms which are determined according to a voice signal sample with an echo and a voice signal sample removing the echo, wherein the voice signal sample removing the echo is a sample obtained by removing the echo from the voice signal sample with the echo.
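
The time domain branch recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it is hypothetical. It assumes that the echo estimation result is a frequency-domain mask, that echo extraction multiplies that mask with the spectrum of the original audio signal, that the signal processing step is a per-frame inverse FFT over non-overlapping frames, and that the fourth adjustment value is a scalar applied to the mask. The claim specifies none of these details. The functions estimate_echo_mask and time_domain_model are fixed-value stand-ins for the echo estimator and the pre-trained time domain optimization model.

```python
import numpy as np

FRAME = 256  # assumed frame length; non-overlapping frames keep the inverse exact


def to_spectrum(waveform: np.ndarray) -> np.ndarray:
    """Split into non-overlapping frames and take a real FFT of each frame."""
    usable = len(waveform) - len(waveform) % FRAME  # drop any trailing partial frame
    frames = waveform[:usable].reshape(-1, FRAME)
    return np.fft.rfft(frames, axis=1)


def to_waveform(spectrum: np.ndarray) -> np.ndarray:
    """Inverse of to_spectrum: per-frame inverse FFT, then concatenation."""
    return np.fft.irfft(spectrum, n=FRAME, axis=1).reshape(-1)


def estimate_echo_mask(spectrum: np.ndarray) -> np.ndarray:
    """Hypothetical echo estimator: a constant mask (assumption, not from the claim)."""
    return np.full(spectrum.shape, 0.5)


def time_domain_model(echo_waveform: np.ndarray) -> float:
    """Stand-in for the pre-trained time domain optimization model.

    A real model would be learned from waveforms of samples with and without
    echo; this stub ignores its input and returns a fixed scalar.
    """
    return 0.8


def determine_echo(original: np.ndarray) -> np.ndarray:
    spectrum = to_spectrum(original)
    mask = estimate_echo_mask(spectrum)            # echo estimation result
    extracted = mask * spectrum                    # echo extraction result
    echo_waveform = to_waveform(extracted)         # time domain waveform
    adjustment = time_domain_model(echo_waveform)  # "fourth adjustment value"
    adjusted_mask = adjustment * mask              # assumed way of applying the adjustment
    return to_waveform(adjusted_mask * spectrum)   # echo determined from the optimized result
```

With these stand-ins, determine_echo returns the input scaled by 0.5 * 0.8 = 0.4 whenever the input length is a multiple of FRAME. That property is only a check that the frame transform round-trips, not a statement about the behaviour of a trained model.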