CPC G06F 40/166 (2020.01) [G06F 16/3344 (2019.01); G06F 16/345 (2019.01); G06F 40/205 (2020.01); H04L 51/216 (2022.05); G06V 20/47 (2022.01)] | 15 Claims |
1. A method of multi-modal summarization of communication on a messaging platform, the method comprising:
receiving, via a user interface, a user request for summarizing communication messages relating to a topic between a first user and a second user on a messaging platform;
searching, via a search engine, for messages between the first user and the second user on the messaging platform;
filtering the messages based on the topic from the user request by predicting, via a topic classification model, whether the messages are related to the topic and excluding a subset of messages that are predicted to be unrelated to the topic;
generating, by a text encoder of a multi-modal summarization model, a text representation from an input sequence corresponding to textual content from the filtered messages;
generating, by an image encoder of the multi-modal summarization model, an image representation of visual features in a multimedia attachment file from the filtered messages;
generating, via a decoder of the multi-modal summarization model, a text summary summarizing both the filtered messages and the multimedia attachment file based on a combination of the text representation generated by the text encoder and the image representation generated by the image encoder,
wherein the text summary comprises a text that references a time and/or a sender of the multimedia attachment file generated from metadata of the multimedia attachment file; and
transmitting, via the user interface, the generated text summary relating to the topic in response to the user request.
|
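The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: every name here (`Message`, `Attachment`, `search_messages`, `filter_by_topic`, `keyword_topic_classifier`, `encode_text`, `encode_image`, `decode`, `summarize`) is hypothetical, the topic classifier is a keyword stand-in for a trained topic classification model, and the encoders and decoder are toy functions standing in for the text encoder, image encoder, and decoder of a multi-modal summarization model. The sketch only shows the data flow: search, topic filtering, two encoders, a combined representation, and a summary that references the attachment's sender and time taken from its metadata.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional, Sequence


@dataclass
class Attachment:
    filename: str
    sender: str              # metadata: who sent the file
    sent_at: datetime        # metadata: when it was sent
    pixels: Sequence[float]  # assumption: stand-in for decoded image data


@dataclass
class Message:
    sender: str
    recipient: str
    text: str
    attachment: Optional[Attachment] = None


def search_messages(store: List[Message], user_a: str, user_b: str) -> List[Message]:
    """Search step: keep messages exchanged between the two users."""
    pair = {user_a, user_b}
    return [m for m in store if {m.sender, m.recipient} == pair]


def keyword_topic_classifier(text: str, topic: str) -> bool:
    """Assumption: keyword match standing in for a topic classification model."""
    return topic.lower() in text.lower()


def filter_by_topic(messages: List[Message], topic: str,
                    is_on_topic: Callable[[str, str], bool]) -> List[Message]:
    """Filter step: exclude messages predicted to be unrelated to the topic."""
    return [m for m in messages if is_on_topic(m.text, topic)]


def encode_text(messages: List[Message]) -> List[float]:
    """Toy text representation: [message count, total characters]."""
    return [float(len(messages)), float(sum(len(m.text) for m in messages))]


def encode_image(attachment: Attachment) -> List[float]:
    """Toy image representation: [mean pixel value]."""
    px = list(attachment.pixels)
    return [sum(px) / len(px) if px else 0.0]


def decode(text_rep: List[float], image_rep: List[float],
           attachment: Optional[Attachment], topic: str) -> str:
    """Toy decoder: builds the summary from the combined representation."""
    combined = text_rep + image_rep  # assumption: combination by concatenation
    summary = f"{int(combined[0])} message(s) about '{topic}'."
    if attachment is not None:
        # Reference the sender and time from the attachment's metadata.
        summary += (f" {attachment.sender} sent {attachment.filename}"
                    f" at {attachment.sent_at:%Y-%m-%d %H:%M}.")
    return summary


def summarize(store: List[Message], user_a: str, user_b: str, topic: str,
              is_on_topic: Callable[[str, str], bool] = keyword_topic_classifier) -> str:
    """End-to-end flow: search, filter, encode both modalities, decode."""
    kept = filter_by_topic(search_messages(store, user_a, user_b), topic, is_on_topic)
    attachment = next((m.attachment for m in kept if m.attachment), None)
    image_rep = encode_image(attachment) if attachment else []
    return decode(encode_text(kept), image_rep, attachment, topic)


if __name__ == "__main__":
    demo = [
        Message("alice", "bob", "Budget draft attached",
                Attachment("budget.png", "alice", datetime(2024, 3, 5, 9, 30), [0.2, 0.4])),
        Message("bob", "alice", "Lunch tomorrow?"),
        Message("bob", "alice", "The budget looks fine"),
    ]
    print(summarize(demo, "alice", "bob", "budget"))
```

In this sketch the attachment is taken only from messages that survive the topic filter, mirroring the claim's wording that the multimedia attachment file comes "from the filtered messages". Passing the classifier as a parameter keeps the filtering step independent of any particular model.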