CPC G10L 21/0208 (2013.01) [G06V 40/172 (2022.01); G06V 40/176 (2022.01); G06V 40/20 (2022.01); G10L 15/25 (2013.01); G10L 17/06 (2013.01); G10L 2021/02087 (2013.01); G10L 2021/02166 (2013.01)]

19 Claims

1. A method of creating a view of an environment, comprising:
accessing computer-readable instructions from one or more memory devices for execution by one or more processors of a computing device;
executing the computer-readable instructions accessed from the one or more memory devices by the one or more processors of the computing device; and
wherein executing the computer-readable instructions further comprises:
receiving, at the computing device, voice files, visual effect files, facial expression files and/or mobility files;
receiving parameters and measurements from at least two of one or more microphones, one or more imaging devices, a radar sensor, a lidar sensor and/or one or more infrared imaging devices located in the computing device;
analyzing the parameters and measurements received from the multimodal input;
generating a world map of the environment around the computing device, the world map including two or more users and objects;
repeating the receiving of parameters and measurements from the input devices and the analyzing steps on a periodic basis to maintain a persistent world map of the environment;
tracking the engagement of the two or more users utilizing the received parameters and measurements to determine the one or more users that are engaged with the computing device;
determining a noise level for the environment based on receipt of sounds and/or sound files from the two or more users and the environment; and
generating mobility commands to cause the computing device to move closer to a user that is speaking to the computing device if the environment is too noisy to hear the user that is speaking.
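The following is a minimal sketch, in Python, of the control loop recited in claim 1: fusing multimodal detections into a persistent world map of users and objects, identifying an engaged, speaking user, and issuing a mobility command toward that user when the measured noise level exceeds a threshold. All names, thresholds, and interfaces here (TrackedUser, WorldMap, control_step, NOISE_THRESHOLD_DB) are hypothetical illustrations, not part of the claimed embodiment.

```python
# Hypothetical sketch of the claimed method; sensor fusion, engagement
# detection, and drive hardware are stubbed out with plain data structures.
import math
from dataclasses import dataclass, field


@dataclass
class TrackedUser:
    user_id: int
    position: tuple          # (x, y) metres in the device frame
    engaged: bool = False    # user is attending to the computing device
    speaking: bool = False


@dataclass
class WorldMap:
    users: dict = field(default_factory=dict)    # user_id -> TrackedUser
    objects: list = field(default_factory=list)  # detected non-user objects

    def update(self, detections):
        """Fold one round of multimodal detections into the persistent map."""
        for det in detections:
            if det["kind"] == "user":
                self.users[det["id"]] = TrackedUser(
                    user_id=det["id"],
                    position=det["position"],
                    engaged=det.get("engaged", False),
                    speaking=det.get("speaking", False),
                )
            else:
                self.objects.append(det)


NOISE_THRESHOLD_DB = 70.0      # hypothetical "too noisy to hear" level
APPROACH_DISTANCE_M = 0.5      # hypothetical stand-off distance from the user


def control_step(world_map, detections, noise_level_db):
    """One periodic iteration: update the map, track engagement, and decide
    whether to generate a mobility command toward the speaking user."""
    world_map.update(detections)

    speaker = next(
        (u for u in world_map.users.values() if u.engaged and u.speaking),
        None,
    )
    if speaker is None or noise_level_db < NOISE_THRESHOLD_DB:
        return None  # environment is quiet enough; no mobility command

    # Environment is too noisy: move toward the speaking user, stopping
    # short of the stand-off distance.
    x, y = speaker.position
    dist = math.hypot(x, y)
    if dist <= APPROACH_DISTANCE_M:
        return None
    scale = (dist - APPROACH_DISTANCE_M) / dist
    return {"type": "move", "target": (x * scale, y * scale)}


if __name__ == "__main__":
    world_map = WorldMap()
    # Stub detections standing in for fused microphone / camera / radar output.
    detections = [
        {"kind": "user", "id": 1, "position": (2.0, 0.5),
         "engaged": True, "speaking": True},
        {"kind": "object", "label": "chair", "position": (1.0, -1.0)},
    ]
    command = control_step(world_map, detections, noise_level_db=78.0)
    print(command)  # e.g. {'type': 'move', 'target': (1.51..., 0.37...)}
```

In an actual embodiment, control_step would be invoked on a periodic basis, corresponding to the repeating step of the claim, so that the world map remains persistent between iterations.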