| CPC G01S 15/04 (2013.01) [G01S 15/86 (2020.01); G10L 25/78 (2013.01)] | 20 Claims |

|
1. A device comprising:
a microphone;
a speaker;
one or more processors; and
one or more memories storing instructions that, upon execution by the one or more processors, configure the device to:
receive first audio data generated by the microphone, the first audio data representing first audio having a first frequency of less than 20 kHz;
receive second audio data generated by the microphone, the second audio data representing second audio having a second frequency of more than 20 kHz, the second audio being emitted by the speaker;
generate, by at least using the first audio data and a first presence detection algorithm, first presence detection data indicating that a voice event is detected within a space;
generate, by at least using the second audio data and a second presence detection algorithm, second presence detection data indicating that a motion event is detected within the space;
generate third presence detection data by at least using the first presence detection data, the second presence detection data, and a fusion model, the fusion model configured to generate the third presence detection data by at least using latent variables and observed variables, the observed variables corresponding to the first presence detection data and the second presence detection data, the latent variables corresponding to sensor-triggering events that include the voice event and the motion event;
determine that the third presence detection data indicates that an object is present within the space; and
cause an action associated with the object to be performed.
|
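
The fusion step recited in claim 1 can be illustrated with a minimal sketch. It is an assumption-laden example, not the claimed implementation: it assumes a small Bayesian model in which object presence influences two latent sensor-triggering events (a voice event and a motion event), and each latent event in turn influences an observed detector output (the first and second presence detection data). All names (`P_EVENT`, `P_DETECT`, `fuse`, `is_present`) and all probability values are hypothetical and chosen only for illustration.

```python
# Hypothetical prior probability that an object is present in the space.
PRIOR_PRESENT = 0.3

# Hypothetical P(latent event occurs | object present?) for each
# sensor-triggering event named in the claim.
P_EVENT = {
    "voice": {True: 0.6, False: 0.05},
    "motion": {True: 0.8, False: 0.1},
}

# Hypothetical P(detector reports an event | latent event occurred?).
# The detector reports are the observed variables.
P_DETECT = {
    "voice": {True: 0.9, False: 0.05},
    "motion": {True: 0.85, False: 0.1},
}


def bernoulli(p, value):
    """Probability of a boolean outcome under success probability p."""
    return p if value else 1.0 - p


def likelihood(observed, present):
    """P(observed detector outputs | presence), summing out each latent event."""
    total = 1.0
    for name, fired in observed.items():
        marginal = 0.0
        for event in (True, False):
            marginal += (
                bernoulli(P_EVENT[name][present], event)
                * bernoulli(P_DETECT[name][event], fired)
            )
        total *= marginal
    return total


def fuse(observed, prior=PRIOR_PRESENT):
    """Posterior probability of presence (the 'third presence detection data')."""
    numerator = prior * likelihood(observed, True)
    denominator = numerator + (1.0 - prior) * likelihood(observed, False)
    return numerator / denominator


def is_present(observed, threshold=0.5):
    """Decide presence by thresholding the fused posterior (threshold is assumed)."""
    return fuse(observed) >= threshold


if __name__ == "__main__":
    both = {"voice": True, "motion": True}
    if is_present(both):
        # Placeholder for the "action associated with the object".
        print("presence detected, posterior:", round(fuse(both), 3))
```

In this sketch, agreement between the audible-band voice detector and the ultrasonic motion detector raises the fused posterior above what either detector yields alone, which is one plausible motivation for fusing the two outputs rather than acting on each independently.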