CPC G06V 20/20 (2022.01) [G06F 3/167 (2013.01); G06F 16/5854 (2019.01); G06F 16/90332 (2019.01); H04L 51/02 (2013.01); G06V 20/68 (2022.01)] | 17 Claims |
1. A method implemented by one or more processors, comprising:
receiving, via an automated assistant interface of a client device, a voice input provided by a user;
determining, based on processing the voice input, that the voice input indicates a request, by the user, related to noise being generated by an object in an environment with the client device and the user;
in response to determining that the voice input indicates the request related to the noise:
processing audio data, that is captured via one or more microphones of the client device and that captures the noise being generated by the object, to determine one or more attributes of the noise being generated by the object;
determining whether the request is resolvable utilizing the one or more attributes of the noise being generated by the object;
in response to determining that the request is not resolvable utilizing the one or more attributes of the noise being generated by the object:
providing a prompt for presentation at the client device or an additional client device;
receiving, in response to the prompt, one or both of:
an image, of the object, captured by the client device or the additional client device, and
further voice input;
resolving the request utilizing the one or more attributes of the noise being generated by the object and based on processing of one or both of the image and the further voice input; and
causing output, that reflects the resolution of the request, to be rendered at the client device or the additional client device.
|
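The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how any step is performed, so every step is passed in as a hypothetical callable, and the names `Handlers`, `FollowUp`, and `handle_voice_input` are assumptions introduced only for illustration. The branch taken when the request is already resolvable from the noise attributes, and the early return when the user supplies nothing, are also assumptions, since claim 1 recites only the not-resolvable path.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FollowUp:
    """Reply to the prompt: an image of the object, further voice input, or both."""
    image: Optional[bytes] = None
    further_voice_input: Optional[str] = None


@dataclass
class Handlers:
    """Hypothetical pluggable steps; the claim leaves each one unspecified."""
    is_noise_request: Callable[[str], bool]
    extract_noise_attributes: Callable[[bytes], dict]
    is_resolvable: Callable[[str, dict], bool]
    prompt_user: Callable[[str], FollowUp]
    resolve: Callable[[str, dict, FollowUp], str]
    render: Callable[[str], None]


def handle_voice_input(voice_input: str, audio_data: bytes, h: Handlers) -> Optional[str]:
    # Determine whether the voice input indicates a request related to noise.
    if not h.is_noise_request(voice_input):
        return None

    # Process the captured audio to determine attributes of the noise.
    attributes = h.extract_noise_attributes(audio_data)

    follow_up = FollowUp()
    # Prompt only when the noise attributes alone cannot resolve the request.
    if not h.is_resolvable(voice_input, attributes):
        follow_up = h.prompt_user("Can you show me or describe the object?")
        if follow_up.image is None and follow_up.further_voice_input is None:
            return None  # assumption: stop if the prompt yields nothing

    # Resolve using the noise attributes plus any image or further voice input.
    # (Resolving without a follow-up is an assumption, not recited in claim 1.)
    result = h.resolve(voice_input, attributes, follow_up)

    # Cause output reflecting the resolution to be rendered.
    h.render(result)
    return result
```

In this sketch a caller would supply its own speech, audio, and image processing through `Handlers`; the function itself only fixes the order of the recited steps.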