CPC H04L 63/0236 (2013.01) [G06N 20/00 (2019.01); H04L 63/0263 (2013.01); H04L 63/1416 (2013.01)] | 14 Claims |
1. A URL filtering system comprising:
a hardware processor; and
a memory accessible by the processor, the memory having stored therein at least one of programs or instructions executable by the at least one processor to cause the filtering system to perform operations comprising:
receiving a URL request to access a resource associated with the URL;
performing a first layer of URL filtering by comparing the URL to a blocklist of URLs having respective malicious resources associated;
determining that the URL does not match a URL on the blocklist;
performing a second layer of filtering by applying a machine learning algorithm to analyze the URL to predict whether a resource associated with the URL is malicious, wherein the machine learning algorithm includes blocklist rules determined from patterns recognized in at least a portion of text of at least one URL in the blocklist, wherein the patterns include at least one of related words in a URL or a context of words in a URL;
determining that a resource associated with the URL is predicted to be malicious;
generating and transmitting a URL filter determination that the resource associated with the URL is malicious and updating the blocklist to include the URL; and
distributing the updated blocklist to at least one of an end user device or a web crawler.
|
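The operations recited in claim 1 can be pictured with a minimal sketch, offered only as an illustration under stated assumptions and not as the claimed implementation. The claim does not specify the machine learning algorithm, so the sketch swaps in a simple word-overlap heuristic: it counts words seen in blocklisted URLs and flags a new URL when enough of its words match. The names `UrlFilter`, `tokenize`, `learn_rules`, `distribute`, `GENERIC_TOKENS`, the `threshold` value, and the example URLs are all hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Tokens too common to count as a pattern (illustrative choice, not from the claim).
GENERIC_TOKENS = {"http", "https", "www", "com", "net", "org"}


def tokenize(url: str) -> set[str]:
    """Split a URL into lowercase alphanumeric words, dropping generic tokens."""
    words = re.split(r"[^a-z0-9]+", url.lower())
    return {w for w in words if w} - GENERIC_TOKENS


@dataclass
class UrlFilter:
    blocklist: set[str]
    threshold: float = 0.5  # hypothetical cut-off for the stand-in predictor
    token_counts: Counter = field(default_factory=Counter)

    def __post_init__(self) -> None:
        self.learn_rules()

    def learn_rules(self) -> None:
        """Derive simple 'blocklist rules': word frequencies in blocklisted URLs."""
        self.token_counts = Counter(
            t for url in self.blocklist for t in tokenize(url)
        )

    def predict_malicious(self, url: str) -> bool:
        """Second layer: stand-in for the ML model (word overlap with the blocklist)."""
        tokens = tokenize(url)
        if not tokens:
            return False
        hits = sum(1 for t in tokens if self.token_counts[t] > 0)
        return hits / len(tokens) >= self.threshold

    def check(self, url: str) -> str:
        """Run both layers and return the filter determination."""
        if url in self.blocklist:            # first layer: exact blocklist match
            return "blocked:blocklist"
        if self.predict_malicious(url):      # second layer: prediction
            self.blocklist.add(url)          # update the blocklist
            self.learn_rules()               # refresh rules from the updated list
            return "blocked:predicted"
        return "allowed"


def distribute(blocklist: set[str], recipients: dict[str, set[str]]) -> None:
    """Copy the updated blocklist to each recipient (e.g. a device or a crawler)."""
    for name in recipients:
        recipients[name] = set(blocklist)
```

In use, a caller would construct `UrlFilter` with a seed blocklist, pass each requested URL to `check`, and call `distribute` whenever `check` returns `"blocked:predicted"`. A production system would replace `predict_malicious` with a trained model; the word-overlap rule here only mirrors the claim's idea that patterns in blocklisted URL text inform the second layer.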