| CPC H04L 63/1416 (2013.01) [H04L 63/1441 (2013.01)] | 17 Claims |

|
1. A method comprising:
detecting, by a computing device, a uniform resource locator (URL) referencing an unknown website;
pre-processing the URL to determine a first probability that the unknown website is malicious, wherein pre-processing the URL comprises:
applying a classifier to the URL, the classifier comprising a set of branches, each branch configured to analyze a respective feature of the URL to output a result of whether the URL is related to a malicious website based on the respective feature; and
concatenating a respective result from each branch to generate a vector as an input to a machine learning model, the vector comprising the first probability of the URL being associated with a malicious website;
inputting the vector into the machine learning model, the machine learning model trained using Secure Sockets Layer (SSL) certificates of known legitimate websites and known malicious websites;
receiving, as output from the machine learning model, a second probability that the unknown website is malicious, wherein the second probability is based at least on a similarity between an SSL certificate associated with the unknown website and the SSL certificates of one or more of the known legitimate websites and known malicious websites;
determining whether the second probability is associated with at least a threshold risk comprising a pre-determined probability value; and
responsive to determining that the second probability is associated with at least the threshold risk, causing a graphical user interface of a client device to display a notification indicating a security risk associated with the unknown website.
|
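
For illustration only, the two-stage flow recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the three branch heuristics, the averaging used for the first probability, the `stand_in_model` function, the `cert_similarity_to_malicious` input, and the 0.7 threshold are all hypothetical names and values that do not appear in the claim. In particular, `stand_in_model` is a placeholder for the trained machine learning model, which the claim says is trained on SSL certificates of known legitimate and known malicious websites.

```python
import ipaddress
from urllib.parse import urlparse


# Hypothetical branches: each inspects one feature of the URL and
# returns a score in [0, 1]. These heuristics are assumptions.
def branch_length(url: str) -> float:
    # Longer URLs score higher, capped at 1.0.
    return min(len(url) / 200.0, 1.0)


def branch_ip_host(url: str) -> float:
    # Score 1.0 when the host is a literal IP address, else 0.0.
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return 1.0
    except ValueError:
        return 0.0


def branch_suspicious_tokens(url: str) -> float:
    # Fraction of an assumed token list that appears in the URL.
    tokens = ("login", "verify", "update", "secure")
    hits = sum(tok in url.lower() for tok in tokens)
    return hits / len(tokens)


BRANCHES = (branch_length, branch_ip_host, branch_suspicious_tokens)


def preprocess(url: str) -> list[float]:
    # Concatenate one result per branch into the model input vector.
    return [branch(url) for branch in BRANCHES]


def first_probability(vector: list[float]) -> float:
    # Assumption: summarize the branch results by their mean.
    return sum(vector) / len(vector)


def stand_in_model(vector: list[float], cert_similarity_to_malicious: float) -> float:
    # Placeholder for the trained model: blends the first probability
    # with a certificate-similarity score supplied by the caller.
    return 0.5 * first_probability(vector) + 0.5 * cert_similarity_to_malicious


def should_notify(second_probability: float, threshold: float = 0.7) -> bool:
    # "At least a threshold risk" is read here as greater than or equal.
    return second_probability >= threshold


if __name__ == "__main__":
    url = "http://192.168.0.1/login"
    vec = preprocess(url)
    p2 = stand_in_model(vec, cert_similarity_to_malicious=0.9)
    print(vec, p2, should_notify(p2))
```

In a real system the certificate-similarity score would come from comparing the unknown website's SSL certificate against the training corpus, and the notification step would drive the client device's graphical user interface rather than return a boolean.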