CPC G10L 21/0324 (2013.01) [G06F 3/0482 (2013.01); G10L 21/0224 (2013.01); G10L 21/10 (2013.01); G10L 25/51 (2013.01); G06F 2203/04806 (2013.01)] | 21 Claims |
1. A system for generating a sound detection score based on comparing a sound-generating apparatus producing incoming spoken audio content to a sound-generating apparatus that produced reference spoken audio content, the system comprising:
an input component receiving a plurality of digital samples of an input electronic audio signal generated based on capture by an audio transducer of incoming spoken audio content;
a transform component transforming the digital samples into a plurality of amplitude sequences, each amplitude sequence n respectively comprising a sequence of amplitudes of nth-most prominent frequency content in frames of the input electronic audio signal;
a test component testing at least one of the amplitude sequences to generate measurements, the testing comparing the measurements to one or more threshold parameter corresponding to a reference audio signal containing reference spoken audio content, to generate the sound detection score;
a datastore storing, in association with a reference sound identifier corresponding to the reference audio signal, one or more transform parameter, one or more test parameter, and the one or more threshold parameter; and
a configuration component configuring the transform component and the test component prior to the transforming and the testing based on the one or more transform parameter and the one or more test parameter, wherein the configuration component retrieves at least the one or more transform parameter, the one or more test parameter, and the one or more threshold parameter from the datastore based on the reference sound identifier.
|
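The data flow recited in claim 1 (datastore lookup by reference sound identifier, configuration, transform, test, score) can be illustrated with a minimal sketch. It is not the claimed implementation, and everything concrete in it is an assumption made for illustration: the names `ReferenceConfig`, `DATASTORE`, `transform`, `measure_and_score`, and `score`, the identifier `"ref-001"`, and all parameter values. "nth-most prominent frequency content" is read here as the nth-largest DFT magnitude in each non-overlapping frame. The score is taken to be the fraction of frames whose tested amplitude meets a single threshold. The claim itself does not fix either choice.

```python
import cmath
import math
from dataclasses import dataclass


@dataclass
class ReferenceConfig:
    """Hypothetical per-reference record; field names are illustrative only."""
    frame_size: int        # transform parameter: samples per frame
    num_sequences: int     # transform parameter: how many ranked sequences to keep
    tested_sequence: int   # test parameter: which sequence n to test (1-based)
    min_amplitude: float   # threshold parameter tied to the reference audio


# Stand-in for the datastore, keyed by a reference sound identifier.
DATASTORE = {
    "ref-001": ReferenceConfig(frame_size=64, num_sequences=3,
                               tested_sequence=1, min_amplitude=0.25),
}


def frame_magnitudes(frame):
    """Normalized DFT magnitudes for bins 0..N/2 of one frame (naive O(N^2))."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) / n
        for k in range(n // 2 + 1)
    ]


def transform(samples, cfg):
    """Build amplitude sequences: sequence n holds, per frame, the
    nth-largest spectral magnitude (assumed reading of 'prominent')."""
    sequences = [[] for _ in range(cfg.num_sequences)]
    last_start = len(samples) - cfg.frame_size
    for start in range(0, last_start + 1, cfg.frame_size):
        frame = samples[start:start + cfg.frame_size]
        ranked = sorted(frame_magnitudes(frame), reverse=True)
        for n in range(cfg.num_sequences):
            sequences[n].append(ranked[n])
    return sequences


def measure_and_score(sequences, cfg):
    """Compare per-frame measurements to the threshold; the score is the
    fraction of frames that meet it (0.0 if there are no frames)."""
    seq = sequences[cfg.tested_sequence - 1]
    if not seq:
        return 0.0
    passed = sum(1 for amplitude in seq if amplitude >= cfg.min_amplitude)
    return passed / len(seq)


def score(samples, reference_id):
    """Retrieve the parameters by identifier, then transform and test."""
    cfg = DATASTORE[reference_id]
    return measure_and_score(transform(samples, cfg), cfg)
```

In this sketch, calling `score(samples, "ref-001")` on a list of float samples returns a value between 0.0 and 1.0. A practical system would presumably use an FFT, overlapping windowed frames, and richer measurements than a single per-frame threshold.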