US 12,293,773 B2
Automatically selecting a sound recognition model for an environment based on audio data and image data associated with the environment
Luca Bondi, Pittsburgh, PA (US); and Irtsam Ghazi, Pittsburgh, PA (US)
Assigned to Robert Bosch GmbH, Stuttgart (DE)
Filed by Robert Bosch GmbH, Stuttgart (DE)
Filed on Nov. 3, 2022, as Appl. No. 18/052,507.
Prior Publication US 2024/0153524 A1, May 9, 2024
Int. Cl. G10L 25/51 (2013.01); H04R 1/08 (2006.01)

CPC G10L 25/51 (2013.01)

16 Claims

1. A system for automatically selecting a sound recognition model for an environment based on audio data and image data associated with the environment, the system comprising:

a camera;

a microphone;

a memory including a plurality of sound recognition models; and

an electronic processor configured to

receive, via an input device, a selection of one or more sound recognition tasks;

receive the audio data associated with the environment from the microphone;

receive the image data associated with the environment from the camera;

determine one or more characteristics of the environment based on the audio data and the image data;

for each of the one or more sound recognition tasks selected, select the sound recognition model from the plurality of sound recognition models based on the one or more characteristics of the environment and the selected sound recognition task;

receive additional audio data associated with the environment from the microphone; and

analyze the additional audio data using the sound recognition model to perform a sound recognition task, wherein the sound recognition task includes generating a prediction regarding the additional audio data.
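
The processor steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name here (Environment, SoundModel, determine_characteristics, select_models, loud_detector), the two environment characteristics (noise level and indoor/outdoor scene), the thresholds, and the match-count scoring rule are hypothetical and chosen only for illustration. The claim does not specify how characteristics are derived or how a model is matched to them.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Sequence


@dataclass(frozen=True)
class Environment:
    noise: str  # "quiet" or "noisy" (assumed characteristic, from audio)
    scene: str  # "indoor" or "outdoor" (assumed characteristic, from image)


@dataclass(frozen=True)
class SoundModel:
    name: str
    task: str   # sound recognition task this stored model serves
    noise: str  # environment the model is assumed to be tuned for
    scene: str
    predict: Callable[[Sequence[float]], str]


def determine_characteristics(audio, image):
    """Toy heuristics (assumptions): RMS level -> noise, mean brightness -> scene."""
    rms = sqrt(sum(s * s for s in audio) / len(audio))
    brightness = sum(image) / len(image)
    return Environment(
        noise="noisy" if rms > 0.1 else "quiet",
        scene="outdoor" if brightness > 128 else "indoor",
    )


def select_models(tasks, env, models):
    """For each selected task, pick the stored model matching the most characteristics."""
    selected = {}
    for task in tasks:
        candidates = [m for m in models if m.task == task]
        if not candidates:
            raise LookupError(f"no model stored for task {task!r}")
        selected[task] = max(
            candidates,
            key=lambda m: (m.noise == env.noise) + (m.scene == env.scene),
        )
    return selected


def loud_detector(audio):
    """Stand-in for a real model: flags any sample with a very high amplitude."""
    return "glass_break" if max(abs(s) for s in audio) > 0.8 else "none"


# The "memory including a plurality of sound recognition models".
MODELS = [
    SoundModel("glass_indoor_quiet", "glass_break", "quiet", "indoor", loud_detector),
    SoundModel("glass_outdoor_noisy", "glass_break", "noisy", "outdoor", loud_detector),
]

# Task selection, then audio and image data, then characteristics, then model selection.
env = determine_characteristics(audio=[0.3, -0.4, 0.35, -0.2], image=[200, 180, 220])
chosen = select_models(["glass_break"], env, MODELS)

# Additional audio data is analyzed with the selected model to generate a prediction.
prediction = chosen["glass_break"].predict([0.1, 0.95, -0.2])
```

In this sketch the selection runs once per selected task, mirroring the "for each of the one or more sound recognition tasks selected" limitation, and the prediction step operates only on audio received after the model has been chosen.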