US 12,235,898 B2
Videochat
Nan Duan, Beijing (CN); Lei Ji, Beijing (CN); and Ming Zhou, Beijing (CN)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Feb. 20, 2024, as Appl. No. 18/582,092.
Application 18/582,092 is a continuation of application No. 17/286,427, granted, now 11,921,782, previously published as PCT/US2019/059292, filed on Nov. 1, 2019.
Claims priority of application No. 201811327202.1 (CN), filed on Nov. 8, 2018.
Prior Publication US 2024/0193208 A1, Jun. 13, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/70 (2019.01); G06F 16/732 (2019.01); G06F 16/735 (2019.01); G06F 16/74 (2019.01); G06F 16/783 (2019.01)
CPC G06F 16/7834 (2019.01) [G06F 16/732 (2019.01); G06F 16/735 (2019.01); G06F 16/745 (2019.01)]
20 Claims
1. A computing device comprising:
a hardware processor;
a memory, the memory storing instructions, which when executed, cause the hardware processor to perform operations comprising:
acquiring, from a conversational artificial intelligence session with a user, a user query;
performing a similarity matching calculation between the user query and each of a plurality of components of a first video, the components comprising audio, image, and subtitle data corresponding to the first video to create a plurality of first similarity scores, each of the plurality of first similarity scores describing a similarity between the user query and a single one of the plurality of components of the first video;
combining the plurality of first similarity scores to generate a combined similarity score;
selecting a video from a plurality of videos based upon the combined similarity score of the first video and a plurality of other combined similarity scores of other videos in the plurality of videos;
generating a query response including the selected video;
generating summary information of the selected video; and
causing the query response to be output on a display device of the user as part of a response to the user query in the conversational artificial intelligence session, the output including the selected video and including the summary information of the selected video.
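
The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that the audio, image, and subtitle components have already been reduced to text (for example a transcript, frame labels, and subtitle lines), it swaps in a simple token-overlap (Jaccard) measure for the similarity matching calculation, it combines the per-component scores with an equal-weight average, and it uses the first few subtitle words as stand-in summary information. All names (`Video`, `component_scores`, `combined_score`, `select_video`, `summarize`, `respond`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Video:
    # Hypothetical text stand-ins for the three claimed components.
    title: str
    audio_text: str      # assumed: transcript of the audio track
    image_text: str      # assumed: labels extracted from frames
    subtitle_text: str   # assumed: subtitle lines joined into one string


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; an assumed stand-in measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def component_scores(query: str, video: Video) -> dict:
    """One similarity score per component (audio, image, subtitle)."""
    return {
        "audio": jaccard(query, video.audio_text),
        "image": jaccard(query, video.image_text),
        "subtitle": jaccard(query, video.subtitle_text),
    }


def combined_score(scores: dict, weights: dict = None) -> float:
    """Combine per-component scores; equal weights are an assumption."""
    if weights is None:
        weights = {name: 1.0 / len(scores) for name in scores}
    return sum(weights[name] * scores[name] for name in scores)


def select_video(query: str, videos: list) -> Video:
    """Pick the video whose combined score is highest among all videos."""
    return max(videos, key=lambda v: combined_score(component_scores(query, v)))


def summarize(video: Video, max_words: int = 8) -> str:
    """Placeholder summary: the first few subtitle words."""
    return " ".join(video.subtitle_text.split()[:max_words])


def respond(query: str, videos: list) -> dict:
    """Build a query response holding the selected video and its summary."""
    best = select_video(query, videos)
    return {"video": best.title, "summary": summarize(best)}
```

In a conversational session, `respond(user_query, catalog)` would be called once per user turn, and the returned dictionary would be rendered on the user's display alongside the rest of the reply. A real system would likely replace `jaccard` with learned multimodal similarity models and tune the combination weights.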