US 11,941,052 B2
Online content evaluation system and methods
Dan Martinec, Nymburk (CZ); Yury Kasimov, Prague (CZ); and Juyong Do, Cupertino, CA (US)
Assigned to Avast Software s.r.o., Prague (CZ)
Filed by Avast Software s.r.o., Prague (CZ)
Filed on Jun. 8, 2021, as Appl. No. 17/342,463.
Prior Publication US 2022/0391445 A1, Dec. 8, 2022
Int. Cl. G06F 16/835 (2019.01); G06F 40/295 (2020.01); G06N 20/00 (2019.01); G06Q 30/0201 (2023.01)
CPC G06F 16/835 (2019.01) [G06F 40/295 (2020.01); G06N 20/00 (2019.01); G06Q 30/0201 (2013.01)]
34 Claims
 
1. A computer-implemented method comprising:
detecting a webpage accessed by a user on a computing device via a browser;
determining content on the webpage, the content comprising a plurality of sentences comprising first text;
applying a first model to the content to determine a plurality of keyword sets;
performing a network search based on each of the plurality of keyword sets to generate a plurality of search results, the plurality of search results indicating a plurality of network locations comprising at least one claim comprising second text;
receiving a plurality of data for a plurality of universal resource locators (“URLs”) and a plurality of labels for the plurality of data;
training a second model based on the plurality of data for the plurality of URLs and the plurality of labels for the plurality of data;
comparing the plurality of search results to the content, the comparing of the plurality of search results to the content comprising:
generating word embeddings of the plurality of sentences and word embeddings of the second text; and
comparing the word embeddings of the plurality of sentences to the word embeddings of the second text of the at least one claim;
comparing the plurality of search results to each other, the comparing of the plurality of search results to each other comprising:
comparing the second text of at least one of the plurality of search results to others of the plurality of search results to determine a plurality of agreement scores;
generating a matrix based on the plurality of agreement scores; and
determining a first score of at least one of the plurality of search results based on an aggregating of the plurality of agreement scores based on the matrix;
determining a plurality of values based on the comparing the plurality of search results to the content;
applying the second model to the plurality of values to determine a second score of the at least one of the plurality of search results;
determining a factualness of the content at least based on the comparing of the word embeddings of the plurality of sentences to the word embeddings of the second text of the at least one claim, based on the first score of the at least one of the plurality of search results, and based on the second score of the at least one of the plurality of search results; and
notifying the user via the browser of the factualness of the content.
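
The comparison steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a toy bag-of-words vector in place of real word embeddings, cosine similarity as the agreement measure, and a row mean as the aggregation over the matrix. The claim does not specify any of these choices, and all function names (`embed`, `cosine`, `agreement_matrix`, `first_scores`, `content_similarity`) are hypothetical. The trained second model and the final factualness determination are not sketched.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for word embeddings (assumption): bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def agreement_matrix(claims):
    """Pairwise agreement scores between the texts of the search results."""
    vecs = [embed(c) for c in claims]
    n = len(vecs)
    return [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]


def first_scores(matrix):
    """Aggregate each row of the matrix (assumption: mean, excluding self)."""
    n = len(matrix)
    if n < 2:
        return [0.0] * n
    return [(sum(row) - row[i]) / (n - 1) for i, row in enumerate(matrix)]


def content_similarity(sentences, claim):
    """Best match between any page sentence and one claim text."""
    claim_vec = embed(claim)
    return max((cosine(embed(s), claim_vec) for s in sentences), default=0.0)


if __name__ == "__main__":
    claims = ["the sky is blue", "the sky is blue", "cats eat fish"]
    matrix = agreement_matrix(claims)
    print(first_scores(matrix))
    print(content_similarity(["The sky is blue", "Water is wet"], claims[0]))
```

In this sketch, a search result whose text agrees with many other results receives a higher first score, while an outlier receives a lower one. A real system would presumably substitute learned embeddings and a more robust agreement measure.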