CPC H04L 63/101 (2013.01) [H04L 41/16 (2013.01)] | 20 Claims |
1. A computer-implemented method of choosing between alternative category labels tentatively assigned to webpages by a classifier ensemble running on processors, including:
applying the classifier ensemble including at least a sensitive category classifier, a non-sensitive category classifier, a title and metadata classifier and a heuristic classifier to at least tens of thousands of webpages;
applying a post processor to outputs of the classifier ensemble and, for at least some of the webpages, tentatively assigning at least two category labels for non-sensitive categories to produce tentatively assigned category labels;
for at least some of the webpages assigned the at least two category labels, automatically determining that at least one but not all of the tentatively assigned category labels is a general label and de-selecting the general label;
saving the assigned category label that is not de-selected to the webpage; and
distributing the assigned category labels for at least some of the tens of thousands of webpages for use in controlling access to webpages by users on user systems protected using the assigned category labels.
|
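The de-selection step of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the names `GENERAL_LABELS`, `deselect_general`, and `assign_labels`, and the example labels, are hypothetical, and the claim does not say how a label is determined to be general. The sketch assumes a fixed set of general labels and takes as input the tentative non-sensitive labels already produced by the post processor.

```python
# Hypothetical set of "general" labels; the claim does not enumerate them.
GENERAL_LABELS = frozenset({"Technology", "Business"})


def deselect_general(tentative, general=GENERAL_LABELS):
    """Return the labels kept for one webpage.

    General labels are de-selected only when a page has at least two
    tentative labels and at least one, but not all, of them is general.
    """
    if len(tentative) < 2:
        return list(tentative)
    flagged = [label for label in tentative if label in general]
    if flagged and len(flagged) < len(tentative):
        # Keep only the more specific labels.
        return [label for label in tentative if label not in general]
    # Either no general labels or all general: nothing is de-selected.
    return list(tentative)


def assign_labels(pages):
    """Map each URL to its saved labels after de-selection."""
    return {url: deselect_general(labels) for url, labels in pages.items()}
```

For example, a page tentatively labeled "Technology" and "Webmail" would keep only "Webmail" under these assumptions, while a page whose tentative labels are all general keeps every label, because the claim de-selects only when at least one but not all labels are general.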