CPC G10L 15/16 (2013.01) [G10L 25/24 (2013.01); G10L 2015/088 (2013.01)] | 17 Claims |
1. A method of operation of an apparatus for keyword spotting, the method comprising:
obtaining, from an input voice, an input feature map;
wherein lengths in a channel direction of the input feature map are independently determined for a plurality of sections;
wherein the plurality of sections is obtained by dividing the input voice by a predetermined period; and
wherein each length is defined based on frequency data extracted from a corresponding section of the input voice and corresponds to a frequency value for the corresponding section;
performing a convolution operation between the input feature map and at least one filter;
wherein performing the convolution operation comprises performing a first convolution operation between the input feature map and each of n different filters;
wherein the n different filters cover a frequency range of the input feature map and each have a channel length that is the same as the channel length for the input feature map; and
wherein the channel length is above zero;
storing a result of the convolution operation as an output feature map; and
extracting a keyword from the input voice based on the output feature map,
wherein each filter of the n different filters used in the first convolution operation is configured to distinguish characteristics of different voices corresponding to letter sounds.
|
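The feature-map and first-convolution steps recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function names `feature_map` and `first_convolution`, the log-magnitude FFT features, the section length of 400 samples, the 20 channels, the filter width of 3, and the random filter weights are all assumptions made for illustration. The final keyword-extraction step is left out because the claim does not specify how it is performed.

```python
import numpy as np

def feature_map(voice, section_len, n_channels):
    """Divide the signal into fixed-period sections and compute frequency
    data per section (log-magnitude FFT bins, an assumed stand-in for the
    frequency features used in practice)."""
    n_sections = len(voice) // section_len
    frames = voice[: n_sections * section_len].reshape(n_sections, section_len)
    spectrum = np.abs(np.fft.rfft(frames, axis=1))[:, :n_channels]
    return np.log1p(spectrum)  # shape: (n_sections, n_channels)

def first_convolution(fmap, filters):
    """Apply n filters whose channel length equals that of the feature map.
    Each filter spans the whole channel axis, so it slides along the
    time (section) axis only."""
    n_filters, width, channels = filters.shape
    assert channels == fmap.shape[1] and channels > 0
    steps = fmap.shape[0] - width + 1
    out = np.empty((steps, n_filters))
    for t in range(steps):
        window = fmap[t : t + width]  # (width, channels)
        # Sum over the time and channel axes: one value per filter.
        out[t] = np.tensordot(filters, window, axes=([1, 2], [0, 1]))
    return out  # output feature map, shape: (steps, n_filters)

rng = np.random.default_rng(0)
voice = rng.standard_normal(16000)            # hypothetical 1 s clip at 16 kHz
fmap = feature_map(voice, section_len=400, n_channels=20)
filters = rng.standard_normal((8, 3, 20))     # n = 8 untrained filters
out = first_convolution(fmap, filters)
print(fmap.shape, out.shape)  # (40, 20) (38, 8)
```

Because every filter covers the full channel length, the output feature map has one column per filter and no remaining frequency axis. In a trained system the filter weights would be learned rather than random.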