| CPC G10L 17/18 (2013.01) [G10L 21/034 (2013.01); G10L 25/78 (2013.01); H04R 1/406 (2013.01); H04R 3/005 (2013.01)] | 12 Claims |

|
1. A signal processing device, comprising:
a main speech detection unit configured to:
receive a first signal from a first sound collection device, a second signal from a second sound collection device, and a third signal from a third sound collection device, wherein
the first sound collection device is associated with a first speaker,
the second sound collection device is associated with a second speaker, and
the third sound collection device is associated with a third speaker;
input, to a first neural network, the first signal and the second signal to obtain first information;
input, to a second neural network, the first signal and the third signal to obtain second information;
input, to a third neural network, the second signal and the third signal to obtain third information;
detect, by integration of the first information and the second information, a presence or an absence of a main speech of the first speaker in the first signal;
output first frame information indicating the presence or the absence of the main speech of the first speaker in the first signal;
detect, by integration of the first information and the third information, a presence or an absence of a main speech of the second speaker in the second signal;
output second frame information indicating the presence or the absence of the main speech of the second speaker in the second signal;
detect, by integration of the second information and the third information, a presence or an absence of a main speech of the third speaker in the third signal; and
output third frame information indicating the presence or the absence of the main speech of the third speaker in the third signal.
|
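The data flow recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify the form of the "information" produced by each neural network or the integration rule, so everything below is an assumption made for illustration: each pairwise network is taken to return, per frame, a pair of scores (one for each of its two input speakers), integration is taken to be an average of two scores compared against a threshold, and `energy_pair_net` is a hypothetical energy-ratio stand-in for a trained network. Only the pairing of signals and the choice of which two pieces of information are integrated for each speaker follow the claim language.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]
Signal = Sequence[Frame]
# Assumed format of the "information": one (score_a, score_b) pair per frame,
# where score_a refers to the first input signal and score_b to the second.
PairInfo = List[Tuple[float, float]]
PairNet = Callable[[Signal, Signal], PairInfo]


def energy_pair_net(sig_a: Signal, sig_b: Signal) -> PairInfo:
    """Hypothetical stand-in for a trained pairwise neural network.

    Scores each frame by the share of total energy in each signal.
    """
    info: PairInfo = []
    for frame_a, frame_b in zip(sig_a, sig_b):
        energy_a = sum(x * x for x in frame_a)
        energy_b = sum(x * x for x in frame_b)
        total = energy_a + energy_b
        if total == 0.0:
            info.append((0.0, 0.0))  # silence in both signals
        else:
            info.append((energy_a / total, energy_b / total))
    return info


def integrate(scores_x: List[float], scores_y: List[float],
              threshold: float = 0.5) -> List[bool]:
    """Assumed integration rule: average two scores, then threshold."""
    return [(x + y) / 2.0 > threshold for x, y in zip(scores_x, scores_y)]


def detect_main_speech(
    s1: Signal, s2: Signal, s3: Signal,
    net12: PairNet = energy_pair_net,
    net13: PairNet = energy_pair_net,
    net23: PairNet = energy_pair_net,
    threshold: float = 0.5,
) -> Tuple[List[bool], List[bool], List[bool]]:
    """Return per-frame presence/absence flags for speakers 1, 2, and 3."""
    info1 = net12(s1, s2)  # first information
    info2 = net13(s1, s3)  # second information
    info3 = net23(s2, s3)  # third information

    # Speaker 1: integrate first and second information.
    frames1 = integrate([a for a, _ in info1], [a for a, _ in info2], threshold)
    # Speaker 2: integrate first and third information.
    frames2 = integrate([b for _, b in info1], [a for a, _ in info3], threshold)
    # Speaker 3: integrate second and third information.
    frames3 = integrate([b for _, b in info2], [b for _, b in info3], threshold)
    return frames1, frames2, frames3


# Toy input: speaker 1 talks in frame 0, speaker 2 in frame 1, frame 2 is silent.
s1 = [[1.0, 1.0], [0.1, 0.1], [0.0, 0.0]]
s2 = [[0.1, 0.1], [1.0, 1.0], [0.0, 0.0]]
s3 = [[0.1, 0.1], [0.1, 0.1], [0.0, 0.0]]
frames1, frames2, frames3 = detect_main_speech(s1, s2, s3)
```

In this sketch each signal takes part in exactly two pairwise comparisons, so the low-level crosstalk picked up by a microphone is outvoted by the comparisons against the microphone of the speaker who is actually talking. A real system would replace `energy_pair_net` with trained models and could use a different integration rule.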