CPC G06F 16/3344 (2019.01) [G06F 16/313 (2019.01); G06F 16/383 (2019.01); G06F 16/9035 (2019.01); G06F 16/908 (2019.01); G06F 16/954 (2019.01); G06Q 30/0627 (2013.01); G06Q 30/0629 (2013.01)] | 21 Claims |
1. A computer-implemented method, comprising:
extracting, by one or more processors of one or more computing devices, a product family name from each of a plurality of unstructured product titles associated with a plurality of products;
determining, by the one or more processors, a degree of similarity between model numbers of the plurality of products, wherein the determining the degree of similarity between the model numbers of the plurality of products comprises:
calculating an edit distance score between the model numbers of the plurality of products; and
determining that the edit distance score between the model numbers of the plurality of products is less than a first predetermined threshold, wherein the first predetermined threshold is determined based at least in part on a manufacturer or product type of the plurality of products;
determining, by the one or more processors, that at least two of the plurality of products are variants of one another by:
determining that the at least two of the plurality of products have a same extracted product family name; and
determining that the degree of similarity between the model numbers of the plurality of products is above a second predetermined threshold;
receiving, by the one or more processors, a user query associated with one of the at least two of the plurality of products; and
populating, in response to the user query associated with the one of the at least two of the plurality of products, a single webpage of a website with data indicative of the at least two of the plurality of products that are determined to be the variants of one another.
|
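The variant test recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify how the family name is extracted, which edit distance is used, or how the "degree of similarity" relates to the edit distance score, so everything below is an assumption made for illustration: family names are matched against a hypothetical known list (`KNOWN_FAMILIES`), the edit distance is Levenshtein distance, similarity is taken as 1 minus the distance normalized by the longer model number, and the threshold values, product names, and field names are invented.

```python
from itertools import combinations

# Hypothetical values; the claim only says the first threshold depends
# at least in part on manufacturer or product type.
FIRST_THRESHOLD = {"acme": 3}      # max edit distance (exclusive), per manufacturer
DEFAULT_FIRST_THRESHOLD = 2
SECOND_THRESHOLD = 0.5             # min normalized similarity (exclusive)
KNOWN_FAMILIES = ["Zephyr", "Orion"]  # assumed lookup list of family names


def extract_family(title):
    """Return the first known family name found in an unstructured title."""
    lowered = title.lower()
    for family in KNOWN_FAMILIES:
        if family.lower() in lowered:
            return family
    return None


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed similarity measure: 1 - distance / length of longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def are_variants(p, q):
    """Same family name, distance below the first threshold,
    and similarity above the second threshold."""
    family_p = extract_family(p["title"])
    if family_p is None or family_p != extract_family(q["title"]):
        return False
    limit = FIRST_THRESHOLD.get(p["manufacturer"].lower(),
                                DEFAULT_FIRST_THRESHOLD)
    if edit_distance(p["model"], q["model"]) >= limit:
        return False
    return similarity(p["model"], q["model"]) > SECOND_THRESHOLD


def variant_pairs(products):
    """Return model-number pairs judged to be variants of one another."""
    return [(p["model"], q["model"])
            for p, q in combinations(products, 2)
            if are_variants(p, q)]


products = [
    {"title": "Acme Zephyr SX100 camera, black",
     "manufacturer": "Acme", "model": "SX100"},
    {"title": "ACME ZEPHYR SX110 Digital Camera Silver",
     "manufacturer": "Acme", "model": "SX110"},
    {"title": "Acme Orion 8 inch tablet",
     "manufacturer": "Acme", "model": "GT800"},
]
print(variant_pairs(products))  # [('SX100', 'SX110')]
```

In a setting like the one claimed, the resulting pairs could then be used to decide which products share a single webpage when a user query matches one of them; that step is outside this sketch.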