CPC G10L 25/21 (2013.01) [G10L 25/18 (2013.01); G10L 25/24 (2013.01); G10L 25/78 (2013.01); H04R 3/00 (2013.01); H04R 2420/07 (2013.01); H04R 2430/03 (2013.01)]
19 Claims
1. A computer-implemented method for identifying at least one audio signal, the method comprising the steps of:
receiving audio data at a receiver module from at least one audio sensor; and
processing the audio data using a signal recognition module;
wherein processing the audio data using the signal recognition module comprises:
based on the received audio data, determining at least one of:
one or more time-varying vector arrays of octave band energies, and
one or more time-varying vector arrays of fractional octave band energies;
determining one or more time-varying vector arrays of Mel-Frequency Cepstral Coefficients (MFCC) values based on the received audio data;
generating audio feature image data based on the one or more time-varying vector arrays of MFCC values, and at least one of:
the one or more time-varying vector arrays of octave band energies, and
the one or more time-varying vector arrays of fractional octave band energies;
wherein the audio feature image data is generated by combining vector values of the one or more time-varying vector arrays of MFCC values and at least one of:
vector values of the one or more time-varying vector arrays of octave band energies, and
vector values of the one or more time-varying vector arrays of fractional octave band energies
into a single matrix; and
identifying at least one audio signal using a first model based on the audio feature image data;
wherein the first model comprises an image recognition model to identify a pattern in the audio feature image data to identify the at least one audio signal.
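The feature-extraction steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the frame size, hop, Hann window, band centre frequencies, mel filterbank layout, decibel scaling, and every function name below are assumptions chosen for illustration. Each frame yields one MFCC vector and one octave band energy vector, and the vectors are combined into a single matrix (rows are features, columns are time frames) that an image recognition model could then take as input. Passing `fraction=3` to `band_energies` would give one-third octave bands instead of full octaves.

```python
import cmath
import math


def power_spectrum(frame):
    """Hann-windowed power spectrum via a naive DFT (fine for short frames)."""
    n = len(frame)
    win = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    return [abs(sum(win[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]


def band_energies(spec, rate, n_fft, centers, fraction=1):
    """Energy in dB per (fractional) octave band; edges at fc * 2**(+-1/(2*fraction))."""
    half = 2 ** (1 / (2 * fraction))
    out = []
    for fc in centers:
        lo, hi = fc / half, fc * half
        energy = sum(p for k, p in enumerate(spec) if lo <= k * rate / n_fft < hi)
        out.append(10 * math.log10(energy + 1e-12))  # dB scaling is an assumption
    return out


def mfcc(spec, rate, n_fft, n_mels=20, n_coeffs=13):
    """MFCCs: triangular mel filterbank, log energies, then DCT-II."""
    to_mel = lambda f: 2595 * math.log10(1 + f / 700)
    to_hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    top = to_mel(rate / 2)
    pts = [to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    log_e = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = pts[m - 1], pts[m], pts[m + 1]
        energy = 0.0
        for k, p in enumerate(spec):
            f = k * rate / n_fft
            if lo < f < hi:
                weight = (f - lo) / (mid - lo) if f <= mid else (hi - f) / (hi - mid)
                energy += weight * p
        log_e.append(math.log(energy + 1e-12))
    return [sum(log_e[m] * math.cos(math.pi * c * (m + 0.5) / n_mels)
                for m in range(n_mels))
            for c in range(n_coeffs)]


def audio_feature_image(signal, rate, size=256, hop=128,
                        centers=(125, 250, 500, 1000, 2000)):
    """Combine per-frame MFCC and octave band vectors into a single matrix."""
    columns = []
    for start in range(0, len(signal) - size + 1, hop):
        spec = power_spectrum(signal[start:start + size])
        columns.append(mfcc(spec, rate, size)
                       + band_energies(spec, rate, size, centers))
    # Transpose so that rows are features and columns are time frames.
    return [list(row) for row in zip(*columns)]


# Hypothetical input: 0.2 s of a 1 kHz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(1600)]
image = audio_feature_image(tone, 8000)
print(len(image), len(image[0]))  # 18 11
```

The resulting 18 x 11 matrix (13 MFCC rows followed by 5 octave band rows, over 11 frames) plays the role of the audio feature image data; the image recognition model itself is left out of the sketch because the claim does not fix its architecture.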