US 12,033,657 B2
Signal component estimation using coherence
Shiufun Cheung, Lexington, MA (US); Zukui Song, Wellesley, MA (US); Cristian Marius Hera, Lancaster, MA (US); and Davis Y. Pan, Arlington, MA (US)
Assigned to Bose Corporation, Framingham, MA (US)
Appl. No. 17/607,649
Filed by Bose Corporation, Framingham, MA (US)
PCT Filed Apr. 30, 2020, PCT No. PCT/US2020/030742
§ 371(c)(1), (2) Date Oct. 29, 2021,
PCT Pub. No. WO2020/223495, PCT Pub. Date Nov. 5, 2020.
Claims priority of provisional application 62/841,608, filed on May 1, 2019.
Prior Publication US 2022/0199105 A1, Jun. 23, 2022
Int. Cl. G10L 25/21 (2013.01); G10L 21/0232 (2013.01); H04R 3/04 (2006.01); G10L 21/0208 (2013.01); G10L 21/0216 (2013.01)
CPC G10L 25/21 (2013.01) [G10L 21/0232 (2013.01); H04R 3/04 (2013.01); G10L 2021/02082 (2013.01); G10L 2021/02163 (2013.01)]
18 Claims
 
1. A method for estimating a power spectral density of a signal component, the method comprising:
receiving, at one or more processing devices, an input signal representing audio captured using a microphone, the input signal comprising at least a first portion that represents acoustic output from a first audio source in an environment, and a second portion that represents other acoustic energy in the environment;
computing, by the one or more processing devices, a frequency domain representation of the input signal that includes a cross-spectral density matrix based on the input signal and an output of the first audio source;
iteratively modifying, by the one or more processing devices, the frequency domain representation of the input signal by a matrix diagonalization process on the cross-spectral density matrix, such that the modified frequency domain representation represents a portion of the input signal in which effects due to all but a selected one of the first and second portion is substantially reduced;
determining, from the modified frequency domain representation, an estimate of a power spectral density of the selected portion; and
at least one of reducing noise or echo in a microphone signal based upon the estimated power spectral density or inserting noise in a far end system based upon the estimated power spectral density.
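
The computing, diagonalizing, and estimating steps of claim 1 can be illustrated with a small numerical sketch. It is not the patented implementation, only one plausible reading under stated assumptions: the cross-spectral density matrix is estimated by Welch-style frame averaging, the microphone signal is the last channel, and the "matrix diagonalization process" is taken to be Gaussian elimination applied one reference channel at a time. The function names, the parameters `nfft`, `hop`, and `eps`, and the unnormalized PSD scaling are all hypothetical choices made for illustration.

```python
import numpy as np


def cross_spectral_matrix(signals, nfft=256, hop=128):
    """Averaged cross-spectral density matrix, one n x n matrix per frequency bin.

    signals: array of shape (n_channels, n_samples); by the convention assumed
    here, reference source outputs come first and the microphone comes last.
    Scaling is unnormalized, so only ratios between PSDs are meaningful.
    """
    signals = np.atleast_2d(np.asarray(signals, dtype=float))
    n_ch, n_samples = signals.shape
    window = np.hanning(nfft)
    n_frames = 1 + (n_samples - nfft) // hop
    S = np.zeros((nfft // 2 + 1, n_ch, n_ch), dtype=complex)
    for i in range(n_frames):
        seg = signals[:, i * hop:i * hop + nfft] * window
        V = np.fft.rfft(seg, axis=1).T              # shape (bins, n_ch)
        S += V[:, :, None] * np.conj(V[:, None, :])  # S[i, j] = V_i * conj(V_j)
    return S / n_frames


def conditioned_residual_psd(S, eps=1e-12):
    """Eliminate each reference channel in turn (assumed diagonalization step).

    After the loop, the last diagonal entry holds the PSD of the part of the
    microphone signal that is not coherent with any reference channel.
    """
    R = S.copy()
    n_ch = R.shape[-1]
    for k in range(n_ch - 1):
        pivot = R[:, k, k].real + eps   # auto-spectrum of reference k
        col = R[:, :, k]                # cross-spectra with reference k
        row = R[:, k, :]
        R = R - col[:, :, None] * row[:, None, :] / pivot[:, None, None]
    return np.maximum(R[:, -1, -1].real, 0.0)
```

For a single reference x and microphone y, the loop runs once and the result reduces to S_yy - |S_xy|^2 / S_xx, which equals S_yy multiplied by one minus the magnitude-squared coherence. The coherent part, S_yy minus this residual, would then estimate the PSD of the portion attributable to the first audio source.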