US 12,444,403 B2
Channel selection apparatus, channel selection method, and program
Kazunori Kobayashi, Tokyo (JP); Shoichiro Saito, Tokyo (JP); and Hiroaki Ito, Tokyo (JP)
Assigned to NTT, INC., Tokyo (JP)
Appl. No. 17/274,394
Filed by NTT, INC., Tokyo (JP)
PCT Filed Aug. 28, 2019, PCT No. PCT/JP2019/033608
§ 371(c)(1), (2) Date Mar. 8, 2021,
PCT Pub. No. WO2020/054405, PCT Pub. Date Mar. 19, 2020.
Claims priority of application No. 2018-169551 (JP), filed on Sep. 11, 2018.
Prior Publication US 2022/0051657 A1, Feb. 17, 2022
Int. Cl. G10L 15/05 (2013.01); G10L 15/08 (2006.01); G10L 21/0272 (2013.01); G10L 25/78 (2013.01)
CPC G10L 15/05 (2013.01) [G10L 15/083 (2013.01); G10L 21/0272 (2013.01); G10L 25/78 (2013.01); G10L 2015/088 (2013.01)]
5 Claims
1. A channel selection apparatus comprising:
input circuitry configured to receive input voice signals of a voice captured by a plurality of microphones over a plurality of channels of input voice signals, at least two of the input voice signals including a predetermined keyword;
addition circuitry configured to add all the received input voice signals of the voice of the plurality of channels to generate a composite voice signal of one channel;
keyword detection circuitry configured to generate a keyword detection result indicating a result of detecting an utterance of the predetermined keyword from the composite voice signal;
power calculation circuitry configured to calculate first power of an input voice signal in each channel of the plurality of channels;
maximum power detection circuitry configured to, when the keyword detection result indicates that the predetermined keyword has been detected, select a channel having the maximum power among respective powers of the input voice signals of the voice over the plurality of channels of the input voice signals as an output channel;
second power calculation circuitry configured to calculate second power of an input voice signal in said each channel of the plurality of channels in a time segment obtained by tracing back a predetermined amount of time from the input voice signals;
weight calculation circuitry configured to calculate a weight having a value that increases the larger the first power output by the power calculation circuitry is than the second power output by the second power calculation circuitry,
wherein the maximum power detection circuitry is further configured to detect, as an output channel, the channel having the maximum power among the respective powers of the input voice signals of the voice over the plurality of channels, and the channel is obtained by weighting the powers of the input voice signals of the voice over the plurality of channels output by the power calculation circuitry with the weight; and
speech recognition circuitry configured to subject the output channel to speech recognition.
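
The signal flow recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation, only one possible reading under stated assumptions: power is taken as mean squared amplitude, the weight is taken as the ratio of the first power to the second power (the claim requires only that the weight increase the larger the first power is than the second), and the keyword detector is a hypothetical callable supplied by the caller. The names mean_power, composite, select_channel, segment_len, lookback and EPS are illustrative and do not appear in the patent.

```python
from typing import Callable, List, Optional, Sequence

EPS = 1e-12  # assumed guard against division by zero; not from the patent


def mean_power(segment: Sequence[float]) -> float:
    """Mean squared amplitude of a segment (0.0 for an empty segment)."""
    return sum(v * v for v in segment) / len(segment) if segment else 0.0


def composite(channels: Sequence[Sequence[float]]) -> List[float]:
    """Add all channels sample by sample into a single composite channel."""
    return [sum(samples) for samples in zip(*channels)]


def select_channel(
    channels: Sequence[Sequence[float]],
    keyword_detected: Callable[[List[float]], bool],  # hypothetical detector
    segment_len: int,
    lookback: int,
) -> Optional[int]:
    """Return the index of the output channel, or None if no keyword is found."""
    if not channels:
        return None
    # Keyword detection runs once, on the composite signal only.
    if not keyword_detected(composite(channels)):
        return None

    scores = []
    for ch in channels:
        end = len(ch)
        # First power: the most recent segment of this channel.
        first = mean_power(ch[max(0, end - segment_len):end])
        # Second power: a segment of the same length ending `lookback`
        # samples earlier (the "traced back" time segment).
        past_end = max(0, end - lookback)
        second = mean_power(ch[max(0, past_end - segment_len):past_end])
        # Assumed weight: grows as the first power exceeds the second.
        weight = first / (second + EPS)
        scores.append(first * weight)

    # The output channel is the one with the maximum weighted power.
    return max(range(len(scores)), key=scores.__getitem__)


# Channel 0 carries steady loud noise; channel 1 is quiet, then a voice starts.
channels = [[1.0] * 8, [0.1] * 4 + [0.8] * 4]
print(select_channel(channels, lambda mix: True, segment_len=4, lookback=4))
```

In this toy input an unweighted maximum-power comparison would favor channel 0, because its steady noise is louder than the voice on channel 1. Under the assumed ratio weight, the channel whose power rose relative to the earlier segment is selected instead. The selected channel would then be passed to speech recognition, which is outside the scope of this sketch.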