US 11,942,107 B2
Voice activity detection with low-power accelerometer
Stefano Paolo Rivolta, Desio (IT); Federico Rizzardini, Settimo Milanese (IT); Lorenzo Bracco, Chivasso (IT); and Roberto Mura, Milan (IT)
Assigned to STMICROELECTRONICS S.r.l., Agrate Brianza (IT)
Filed by STMICROELECTRONICS S.r.l., Agrate Brianza (IT)
Filed on Feb. 23, 2021, as Appl. No. 17/183,288.
Prior Publication US 2022/0270593 A1, Aug. 25, 2022
Int. Cl. G10L 25/78 (2013.01); G06N 5/01 (2023.01); G06N 20/10 (2019.01); G10L 25/09 (2013.01); G10L 25/30 (2013.01); G10L 25/51 (2013.01); H04R 25/00 (2006.01); G10L 15/16 (2006.01); G10L 19/26 (2013.01)

CPC G10L 25/78 (2013.01) [G06N 5/01 (2023.01); G06N 20/10 (2019.01); G10L 25/09 (2013.01); G10L 25/30 (2013.01); G10L 25/51 (2013.01); H04R 25/40 (2013.01); H04R 25/505 (2013.01); H04R 25/604 (2013.01); G10L 15/16 (2013.01); G10L 19/26 (2013.01)]

14 Claims

1. A device, comprising:

an accelerometer configured to:

measure acceleration of the device along a plurality of axes;

generate a first acceleration signal based on the acceleration measured by the accelerometer;

apply a filter to the first acceleration signal;

determine a first characteristic of the filtered first acceleration signal along a first axis of the plurality of axes;

classify the first acceleration signal as a non-human speech signal in a case where the first characteristic does not satisfy a first determined condition;

determine a second characteristic of the filtered first acceleration signal along a second axis of the plurality of axes in a case where the first characteristic satisfies the first determined condition, the second characteristic being the same as the first characteristic;

classify the first acceleration signal as a non-human speech signal in a case where the second characteristic does not satisfy a second determined condition;

determine a third characteristic of the filtered first acceleration signal along a third axis of the plurality of axes in a case where the second characteristic satisfies the second determined condition, the third characteristic being different from the second characteristic, each of the first, second, and third characteristics being a peak-to-peak calculation, a zero crossing calculation, a peak count calculation, or a variance calculation;

classify the first acceleration signal as a non-human speech signal in a case where the third characteristic does not satisfy a third determined condition;

classify the first acceleration signal as a human speech signal in a case where the third characteristic satisfies the third determined condition;

determine a first count value that indicates a total number of times the first acceleration signal has been classified as a human speech signal; and

output a detection signal that indicates human speech is present or human speech is absent, the detection signal indicating human speech is present in a case where the first count value is equal to or greater than a first threshold count value; and

an operating system layer configured to receive the detection signal.
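
The classification cascade recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each window of samples has already been filtered, that the first and second characteristics are peak-to-peak calculations, and that the third is a zero crossing calculation. All function names, threshold values, and the "greater than or equal to" form of each condition are hypothetical, chosen only for illustration; the claim does not specify them.

```python
# Illustrative sketch only: names, thresholds, and condition forms are assumptions.

def peak_to_peak(samples):
    """Peak-to-peak calculation: difference between largest and smallest sample."""
    return max(samples) - min(samples)


def zero_crossings(samples):
    """Zero crossing calculation: number of sign changes between adjacent samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


def classify_window(x, y, z, p2p_min_x=0.05, p2p_min_y=0.05, zc_min_z=4):
    """Return True if a filtered window is classified as human speech.

    Each axis is examined only if the previous axis satisfied its condition,
    so most non-speech windows are rejected after a single calculation.
    """
    # First characteristic on the first axis (assumed: peak-to-peak).
    if peak_to_peak(x) < p2p_min_x:
        return False
    # Second characteristic on the second axis, same kind as the first.
    if peak_to_peak(y) < p2p_min_y:
        return False
    # Third characteristic on the third axis, a different kind (assumed: zero crossings).
    return zero_crossings(z) >= zc_min_z


def detect_speech(windows, count_threshold=3):
    """Return True (speech present) when the number of windows classified as
    human speech is equal to or greater than the threshold count."""
    count = sum(1 for x, y, z in windows if classify_window(x, y, z))
    return count >= count_threshold


if __name__ == "__main__":
    vibrating = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
    still = [0.0] * 6
    speech_window = (vibrating, vibrating, vibrating)
    quiet_window = (still, still, still)
    print(detect_speech([speech_window] * 3))
    print(detect_speech([speech_window] * 2 + [quiet_window]))
```

In this sketch the early returns mirror the claim's ordering: a window is classified as non-human speech as soon as one condition fails, and the detection signal depends only on the accumulated count of windows classified as human speech.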