CPC G06F 16/338 (2019.01) [G06F 16/353 (2019.01)] | 20 Claims |
1. A system comprising:
one or more processors; and
one or more non-transitory computer-readable media storing computing instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising:
parsing one or more web texts;
determining a respective web text sentiment score for each respective web text of the one or more web texts, as parsed;
creating a ranked list of one or more match words in the one or more web texts;
scoring the one or more match words in the ranked list of the one or more match words;
creating a report covering a predetermined period of time using the one or more match words, as scored, in the ranked list;
extracting one or more topics from the report covering the predetermined period of time;
labeling, using a generative model, the one or more match words to create labeled training data, wherein the generative model is configured to determine a distribution of each label among respective data points prior to assigning respective labels to respective training data;
training a word-based classifier using the labeled training data to identify non-conforming web text submitted to a website for display, wherein the one or more web texts comprises the non-conforming web text;
determining a word-based classifier score using the word-based classifier;
determining an image-based classifier score using an image-based classifier;
combining the word-based classifier score with the image-based classifier score to create a hybrid score; and
automatically removing the non-conforming web text when at least one of the word-based classifier score, the image-based classifier score, or the hybrid score exceeds a predetermined threshold submitted to the website for display.
|
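The final two operations of claim 1 (combining the word-based and image-based classifier scores into a hybrid score, then removing the web text when any of the three scores exceeds the predetermined threshold) can be illustrated with a minimal sketch. The claim does not state how the scores are combined or what the threshold is, so the weighted average, the `word_weight` parameter, the 0.8 threshold, and the names `Scores`, `hybrid_score`, and `should_remove` are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Classifier outputs for one submitted web text (hypothetical container)."""
    word: float   # word-based classifier score
    image: float  # image-based classifier score


def hybrid_score(scores: Scores, word_weight: float = 0.5) -> float:
    # Assumption: the claim only says the scores are "combined";
    # a weighted average is one possible combination, not the claimed one.
    return word_weight * scores.word + (1.0 - word_weight) * scores.image


def should_remove(scores: Scores, threshold: float = 0.8,
                  word_weight: float = 0.5) -> bool:
    # Remove when at least one of the three scores exceeds the threshold.
    # Assumption: a single shared threshold with an illustrative value of 0.8.
    hybrid = hybrid_score(scores, word_weight)
    return any(s > threshold for s in (scores.word, scores.image, hybrid))
```

Under these assumptions, `should_remove(Scores(word=0.9, image=0.1))` flags the text because the word-based score alone exceeds the threshold, even though the hybrid score (0.5) does not.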