CPC H04L 63/1408 (2013.01) | 12 Claims |
1. A method of processing web requests directed to a website, the method including, at a system for processing web requests:
(i) for each of a plurality of web requests directed to a website, determining a request vector corresponding to the web request by applying a hash function to each web request to convert multiple predetermined features of each request into a request vector of a predefined size using hash values output by the hash function as indices of the request vector, wherein each request vector represents the multiple predetermined features of the respective web request;
(ii) clustering the request vectors by respectively assigning each request vector to one of a plurality of clusters using a clustering algorithm such that request vectors deemed to be similar to each other are assigned to a same cluster of the plurality of clusters;
(iii) repeatedly updating the clustering of request vectors using the clustering algorithm such that the plurality of clusters dynamically change over time;
(iv) monitoring cluster metadata associated with each cluster as the plurality of clusters dynamically change over time, wherein the monitored cluster metadata associated with each cluster represents a current state of the cluster;
(v) identifying, based on the monitoring, any cluster meeting a predetermined anomaly criterion indicating that the cluster is displaying anomalous behaviour; and
(vi) triggering an investigation of a cluster identified as meeting the predetermined anomaly criterion,
wherein cluster metadata associated with each cluster includes a cluster vector based on the request vectors represented by the respective cluster, and a cluster weight based on a number of request vectors represented by the respective cluster, and
wherein updating the clustering of the request vectors includes:
updating the cluster metadata to reflect a current state of the cluster by applying a time decay algorithm to each cluster vector, wherein the time decay algorithm causes a magnitude of the cluster vector to decay with time; and
for each cluster vector: discarding a value of one or more indices of the cluster vector when the value is deemed insignificant.
|
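For illustration only, steps (i) to (v) and the two updating sub-steps of claim 1 might be sketched in Python as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the claim does not specify the hash function, vector size, similarity measure, decay law, or anomaly criterion. The SHA-256 hash, the size of 64, the cosine-similarity threshold, the half-life decay, the pruning threshold `epsilon`, the decay of the cluster weight, and the weight-based anomaly test are all hypothetical choices made for the example, as are all function and class names.

```python
import hashlib
import math

VECTOR_SIZE = 64  # assumed "predefined size" of a request vector


def request_vector(features, size=VECTOR_SIZE):
    """Step (i): hash each feature and use the hash value as a vector index."""
    vec = {}  # sparse vector: index -> value
    for name, value in sorted(features.items()):
        digest = hashlib.sha256(f"{name}={value}".encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % size
        vec[index] = vec.get(index, 0.0) + 1.0  # colliding features accumulate
    return vec


def cosine(a, b):
    """Cosine similarity of two sparse vectors (assumed similarity measure)."""
    dot = sum(v * b.get(i, 0.0) for i, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Cluster:
    """Cluster metadata: a cluster vector and a cluster weight."""

    def __init__(self, vec, now):
        self.vector = dict(vec)
        self.weight = 1.0
        self.updated = now

    def decay(self, now, half_life, epsilon):
        # Time decay: shrink the cluster vector's magnitude with elapsed time,
        # then discard index values deemed insignificant (below epsilon).
        factor = 0.5 ** ((now - self.updated) / half_life)
        scaled = {i: v * factor for i, v in self.vector.items()}
        self.vector = {i: v for i, v in scaled.items() if v >= epsilon}
        self.weight *= factor  # assumption: the weight decays with the vector
        self.updated = now

    def absorb(self, vec):
        for i, v in vec.items():
            self.vector[i] = self.vector.get(i, 0.0) + v
        self.weight += 1.0


def assign(clusters, vec, now, threshold=0.8, half_life=60.0, epsilon=0.01):
    """Steps (ii) and (iii): decay all clusters, then assign vec to the most
    similar cluster, or start a new cluster if none is similar enough."""
    for c in clusters:
        c.decay(now, half_life, epsilon)
    best = max(clusters, key=lambda c: cosine(c.vector, vec), default=None)
    if best is not None and cosine(best.vector, vec) >= threshold:
        best.absorb(vec)
        return best
    new = Cluster(vec, now)
    clusters.append(new)
    return new


def anomalous(cluster, max_weight=100.0):
    """Steps (iv) and (v): a hypothetical anomaly criterion on cluster weight."""
    return cluster.weight > max_weight
```

In use, each incoming request would be converted with `request_vector`, passed to `assign` together with a timestamp, and the returned cluster checked with `anomalous`; a cluster flagged in this way would then be handed to whatever investigation step (vi) refers to, which is outside the scope of this sketch.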