US 12,424,233 B2
Target source signal generation apparatus, target source signal generation method, and program
Rintaro Ikeshita, Tokyo (JP); Tomohiro Nakatani, Tokyo (JP); and Shoko Araki, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Appl. No. 18/265,909
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
PCT Filed Dec. 14, 2020, PCT No. PCT/JP2020/046508
§ 371(c)(1), (2) Date Jun. 7, 2023,
PCT Pub. No. WO2022/130445, PCT Pub. Date Jun. 23, 2022.
Prior Publication US 2024/0038253 A1, Feb. 1, 2024
Int. Cl. G10L 21/0216 (2013.01); G06F 17/11 (2006.01); G06F 17/16 (2006.01)
CPC G10L 21/0216 (2013.01) [G06F 17/11 (2013.01); G06F 17/16 (2013.01)]
5 Claims
 
1. A sound source signal generation device in which K and M are integers satisfying 1≤K<M, x(f, t) (f=1, . . . , F, t=1, . . . , T, and F and T are integers satisfying 1≤F and 1≤T) (where f is an index indicating a frequency bin and t is an index indicating a time frame) is an observed signal of mixed sound from K sound sources observed using M microphones, x_i(f, t) (i=1, . . . , K, f=1, . . . , F, t=1, . . . , T) is an i-th sound source signal, the i-th sound source signal being an estimation signal of an i-th sound source, W(f)=[w_1(f), . . . , w_K(f), W_Z(f)] (where C is a set of complex numbers, w_i(f)∈C^M (i=1, . . . , K) is a separation filter for the i-th sound source signal, z represents noise signal, and W_Z(f)∈C^(M×(M−K)) is a separation filter for a noise signal) is a separation matrix, V_i(f) (i=1, . . . , K) is an auxiliary function of the i-th sound source signal, and V_Z(f) is an auxiliary function of the noise signal,
the sound source signal generation device comprising
an initialization circuitry configured to initialize a separation matrix W(f) and an auxiliary function V_Z(f);
an optimization circuitry configured to optimize the separation matrix W(f) using the observed signal x(f, t); and
a sound source signal generation circuitry configured to generate an i-th sound source signal x_i(f, t) from the observed signal x(f, t) using the separation matrix W(f),
wherein the optimization circuitry includes
an auxiliary function calculation circuitry configured to calculate the auxiliary function V_i(f) (i=1, . . . , K) using the following equations;

[Equation image not reproduced in this text]
(where h represents complex conjugate transpose and s_i(f, t) is a complex number)

[Equation image not reproduced in this text]
(where s_i(t)=[s_i(1, t), . . . , s_i(F, t)]^T is a vector and r_i(t) is a real number)

[Equation image not reproduced in this text]
(where β is a predetermined constant and α_i^β is a real number)

[Equation image not reproduced in this text]
(where φ_i(t) is a real number)

[Equation image not reproduced in this text]
a first separation filter calculation circuitry configured to calculate the separation filters w_i(f) (i=1, . . . , K) using auxiliary functions V_i(f) (i=1, . . . , K) and V_Z(f); and
a second separation filter calculation circuitry configured to calculate a separation filter W_Z(f) according to a predetermined equation when a convergence condition is satisfied.
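
Illustrative sketch (not part of the claim). The equations recited for the auxiliary function calculation circuitry are not reproduced in this text, so the NumPy code below only illustrates the general shape of such a step under stated assumptions. It assumes that s_i(f, t) is obtained by applying the conjugate-transposed filter w_i(f) to x(f, t), that r_i(t) is the Euclidean norm of the vector s_i(t) over frequency, and that V_i(f) is a time average of x(f, t) x(f, t)^h weighted by φ_i(t). The weighting φ_i(t) = 1/r_i(t) is a common choice in auxiliary-function-based separation and is substituted here purely for illustration; the claim's weighting involving β and α_i^β is not shown. The function names, array layouts, and the eps parameter are hypothetical.

```python
import numpy as np


def separate(W, x, K):
    """Apply the first K separation filters (assumed form s_i = w_i^h x).

    W: (F, M, M) complex array whose columns are [w_1, ..., w_K, W_Z].
    x: (F, T, M) complex array of observed signals x(f, t).
    Returns s with shape (K, F, T), where s[i, f, t] = w_i(f)^h x(f, t).
    """
    return np.einsum("fmi,ftm->ift", W[:, :, :K].conj(), x)


def auxiliary_covariances(W, x, K, eps=1e-8):
    """Compute weighted covariances V_i(f) for i = 1..K (assumed form)."""
    s = separate(W, x, K)                         # (K, F, T)
    r = np.sqrt(np.sum(np.abs(s) ** 2, axis=1))   # (K, T): norm over frequency
    # ASSUMPTION: 1/r weighting, not the claim's beta-dependent weighting.
    phi = 1.0 / np.maximum(r, eps)                # (K, T)
    T = x.shape[1]
    # V[i, f] = (1/T) * sum_t phi[i, t] * x(f, t) x(f, t)^h
    V = np.einsum("it,ftm,ftn->ifmn", phi, x, x.conj()) / T
    return s, r, phi, V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    F, T, M, K = 4, 50, 3, 2                      # satisfies 1 <= K < M
    x = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
    W = np.tile(np.eye(M, dtype=complex), (F, 1, 1))  # identity initialization
    s, r, phi, V = auxiliary_covariances(W, x, K)
    print(s.shape, r.shape, V.shape)
```

In a full device as claimed, a step of this kind would sit inside the optimization loop: the filters w_i(f) would be updated from V_i(f) and V_Z(f), and W_Z(f) would be computed once the convergence condition holds. Those update rules are left out here because the text does not give them.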