| CPC G06Q 50/01 (2013.01) [G06F 3/011 (2013.01); G06F 3/013 (2013.01); G06F 9/453 (2018.02); G06F 9/485 (2013.01); G06F 9/4862 (2013.01); G06F 9/4881 (2013.01); G06F 9/547 (2013.01); G06F 16/3329 (2019.01); G06F 16/90332 (2019.01); G06F 16/9536 (2019.01); G06F 18/2321 (2023.01); G06F 40/205 (2020.01); G06F 40/242 (2020.01); G06F 40/253 (2020.01); G06F 40/295 (2020.01); G06F 40/30 (2020.01); G06F 40/35 (2020.01); G06F 40/56 (2020.01); G06N 3/04 (2013.01); G06N 3/045 (2023.01); G06N 3/047 (2023.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01); G06Q 10/109 (2013.01); G06Q 30/0603 (2013.01); G06Q 30/0631 (2013.01); G06Q 30/0633 (2013.01); G06Q 30/0643 (2013.01); G06V 10/255 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 20/00 (2022.01); G06V 20/20 (2022.01); G06V 20/30 (2022.01); G06V 40/16 (2022.01); G06V 40/25 (2022.01); G10L 15/063 (2013.01); G10L 15/08 (2013.01); G10L 15/16 (2013.01); G10L 15/1815 (2013.01); G10L 15/1822 (2013.01); G10L 15/22 (2013.01); G10L 15/30 (2013.01); G10L 15/32 (2013.01); H04L 51/18 (2013.01); H04L 51/212 (2022.05); H04L 51/222 (2022.05); H04L 51/224 (2022.05); H04L 51/52 (2022.05); H04L 67/306 (2013.01); H04L 67/75 (2022.05); H04N 7/147 (2013.01); G06F 3/017 (2013.01); G06F 3/167 (2013.01); G06V 20/41 (2022.01); G06V 40/174 (2022.01); G06V 2201/10 (2022.01); G10L 2015/0631 (2013.01); G10L 2015/088 (2013.01); G10L 2015/223 (2013.01); G10L 2015/227 (2013.01); G10L 2015/228 (2013.01)] | 20 Claims |

|
1. A method comprising, by one or more computing systems:
establishing a video call between a plurality of client systems, wherein access to an assistant system is persistently maintained during the video call;
receiving, from a first client system of the plurality of client systems, a request by a first user to be performed by the assistant system during the video call, wherein the request references one or more activities to be performed after the request and is associated with one or more users associated with the plurality of client systems, wherein the one or more activities referenced in the request are physical actions performed by the one or more users after the request is made;
analyzing, by a context engine of the assistant system after receiving the request from the first user, images of a scene of the video call to identify within the scene the one or more activities referenced in the request;
instructing the assistant system to execute the request responsive to the identification of one or more of the activities referenced in the request; and
sending, to one or more of the plurality of client systems, a response to the request while maintaining the video call between the plurality of client systems.
|
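The steps recited in claim 1 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the names `DeferredRequest`, `VideoCall`, `detect_activities`, and `run_request` are hypothetical, and each video frame is modeled as a set of activity labels that has already been extracted, standing in for whatever image analysis the context engine actually performs.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set, Tuple


@dataclass
class DeferredRequest:
    """Hypothetical request that waits on a physical activity occurring later."""
    requester: str
    trigger_activities: Set[str]   # activities referenced in the request
    action: Callable[[], str]      # what the assistant does once triggered


@dataclass
class VideoCall:
    """Toy stand-in for a call between several client systems."""
    participants: List[str]
    active: bool = True
    outbox: List[Tuple[str, str]] = field(default_factory=list)

    def send(self, recipients: Iterable[str], message: str) -> None:
        # Record a response for each recipient; the call itself is left running.
        for recipient in recipients:
            self.outbox.append((recipient, message))


def detect_activities(frame: Set[str]) -> Set[str]:
    """Assumed context-engine step: a frame here is already a set of labels."""
    return set(frame)


def run_request(call: VideoCall, request: DeferredRequest,
                frames: Iterable[Set[str]]) -> bool:
    """Scan frames that arrive after the request; execute on the first match."""
    for frame in frames:
        if not call.active:
            break
        matched = detect_activities(frame) & request.trigger_activities
        if matched:
            response = request.action()             # execute the request
            call.send(call.participants, response)  # respond, call stays active
            return True
    return False
```

In this sketch, a request such as "take a photo when someone smiles" would become a `DeferredRequest` whose `trigger_activities` is `{"smile"}`. `run_request` sends the response to every participant once a frame contains that label, and `call.active` is never cleared, which mirrors the requirement that the video call be maintained while the response is sent.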