| CPC G06F 16/35 (2019.01) [G06F 16/3334 (2019.01); G06F 16/955 (2019.01)] | 20 Claims |

|
1. A method, comprising:
receiving plural Uniform Resource Locators (URLs), each URL of the plural URLs corresponding to a respective webpage of a website;
accessing, from a database, a set of terms, the set of terms having been predetermined as prioritized;
extracting distinct terms corresponding to a path level, a query key and a cvar key for the plural URLs;
computing a similarity score of the distinct terms with the set of terms;
identifying, based on the computing, URLs of the plural URLs having at least one term appearing within the set of terms;
applying weights to the identified URLs, to prioritize the identified URLs relative to other URLs of the plural URLs;
performing, based on applying the weights, hierarchical clustering with respect to the plural URLs, to generate a dendrogram in which the plural URLs are arranged in hierarchical clusters;
storing a representation of the dendrogram;
automatically determining, based on the stored representation of the dendrogram, a predicted page group for each of the plural URLs; and
causing, based on determining the predicted page group for each of the plural URLs, display of metrics corresponding to the website.
|
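For illustration only, the steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `PRIORITY_TERMS`, `PRIORITY_WEIGHT`, `extract_terms`, `build_dendrogram` and `page_groups` are hypothetical, the prioritized terms are held in memory rather than read from a database, cvar keys are assumed to be supplied separately for each URL, the similarity score is reduced to exact term matching, and a weighted Jaccard distance with average linkage stands in for whatever clustering measure an actual embodiment uses.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical stand-ins for the prioritized terms and the applied weight.
PRIORITY_TERMS = {"cart", "checkout"}
PRIORITY_WEIGHT = 2.0


def extract_terms(url, cvar_keys=()):
    """Return distinct (kind, name) terms: path levels, query keys, cvar keys."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    terms = {(f"path{level}", seg.lower()) for level, seg in enumerate(segments)}
    terms |= {("query", k.lower())
              for k, _ in parse_qsl(parts.query, keep_blank_values=True)}
    terms |= {("cvar", k.lower()) for k in cvar_keys}
    return terms


def weight(term):
    """Terms appearing in the prioritized set count more (exact-match assumption)."""
    return PRIORITY_WEIGHT if term[1] in PRIORITY_TERMS else 1.0


def distance(a, b):
    """Weighted Jaccard distance between two term sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - sum(map(weight, a & b)) / sum(map(weight, union))


def build_dendrogram(urls):
    """Average-linkage agglomerative clustering.

    Returns the merge list [(id_a, id_b, distance), ...]; leaves are numbered
    0..n-1 and each merge creates the next id, which serves as a stored
    representation of the dendrogram.
    """
    terms = [extract_terms(u) for u in urls]
    clusters = {i: [i] for i in range(len(urls))}
    merges, next_id = [], len(urls)
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for x, a in enumerate(ids):
            for b in ids[x + 1:]:
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(distance(terms[i], terms[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((a, b, d))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


def page_groups(n, merges, threshold=0.5):
    """Cut the dendrogram at `threshold`; return one group label per URL."""
    members = {i: [i] for i in range(n)}
    next_id = n
    for a, b, d in merges:
        if d > threshold:  # average-linkage distances never decrease
            break
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    labels = [0] * n
    for label, group in enumerate(sorted(members.values(), key=min)):
        for i in group:
            labels[i] = label
    return labels


if __name__ == "__main__":
    urls = [
        "https://example.com/shop/cart?item=1",
        "https://example.com/shop/cart?item=2",
        "https://example.com/blog/posts?id=1",
        "https://example.com/blog/posts?id=2&ref=home",
    ]
    merges = build_dendrogram(urls)
    print(page_groups(len(urls), merges))
```

In this sketch the predicted page group of a URL is simply the label of the cluster it falls into when the dendrogram is cut at the chosen threshold; metrics for the website could then be aggregated per label for display.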