US 11,734,242 B1
Architecture for resolution of inconsistent item identifiers in a global catalog
Karim Bouyarmane, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 31, 2021, as Appl. No. 17/219,329.
Int. Cl. G06F 16/00 (2019.01); G06F 16/22 (2019.01); G06F 16/29 (2019.01); G06Q 30/0601 (2023.01); G06F 16/14 (2019.01); G06F 16/242 (2019.01); G06F 16/35 (2019.01); G06F 16/51 (2019.01)
CPC G06F 16/2228 (2019.01) [G06F 16/14 (2019.01); G06F 16/2438 (2019.01); G06F 16/29 (2019.01); G06F 16/35 (2019.01); G06F 16/51 (2019.01); G06Q 30/0631 (2013.01)]
21 Claims
 
1. A system comprising:
a computer-readable memory storing executable instructions; and
a processor in communication with the computer-readable memory and configured by the executable instructions to at least:
obtain a request to associate a first item with an item identifier in a global catalog, the global catalog identifying items and a plurality of sets of features;
determine a subset of the items associated with the item identifier in the global catalog, wherein the subset of the items are associated with a plurality of languages;
obtain item data corresponding to the subset of the items from the global catalog;
obtain a plurality of embedding output vectors for the subset of the items based at least in part on the item data, wherein the plurality of embedding output vectors represent features associated with the subset of the items;
generate a distribution of the plurality of embedding output vectors;
determine a plurality of clusters of items from the distribution of the plurality of embedding output vectors, wherein each cluster of the plurality of clusters of items comprises one or more items that are associated with the item identifier and share a respective set of features of the plurality of sets of features;
identify a primary cluster of items from the plurality of clusters of items, the primary cluster identifying a set of verified items consistent with the item identifier;
disassociate a secondary cluster of items of the plurality of clusters of items with the item identifier in the global catalog based at least in part on the distribution of the plurality of embedding output vectors;
determine that the first item corresponds to the primary cluster;
in response to determining that the first item corresponds to the primary cluster, associate the first item with the item identifier in the global catalog;
determine that a second item does not correspond to the primary cluster; and
in response to determining that the second item does not correspond to the primary cluster, deny a request to associate the second item with the item identifier in the global catalog.
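
The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not name a clustering algorithm, a similarity measure, or a rule for choosing the primary cluster, so the greedy centroid-based clustering, the cosine similarity, the 0.9 threshold, and the "largest cluster is primary" rule below are all assumptions made for illustration. The item names and three-dimensional vectors are hypothetical stand-ins for embedding output vectors that a multilingual model would produce from catalog item data.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cluster_items(embeddings: Dict[str, Vector], threshold: float) -> List[List[str]]:
    """Assumed clustering step: each item joins the first cluster whose
    centroid is at least `threshold` similar, otherwise starts a new cluster."""
    clusters: List[List[str]] = []
    for item_id, vec in embeddings.items():
        for members in clusters:
            if cosine(vec, centroid([embeddings[m] for m in members])) >= threshold:
                members.append(item_id)
                break
        else:
            clusters.append([item_id])
    return clusters


def resolve_identifier(
    embeddings: Dict[str, Vector], threshold: float = 0.9
) -> Tuple[List[str], List[List[str]]]:
    """Split the items sharing one identifier into a primary cluster and
    secondary clusters. Assumption: the largest cluster is the primary one."""
    clusters = cluster_items(embeddings, threshold)
    primary = max(clusters, key=len)
    secondary = [c for c in clusters if c is not primary]  # candidates to disassociate
    return primary, secondary


def admit(
    candidate: Vector, primary: List[str],
    embeddings: Dict[str, Vector], threshold: float = 0.9,
) -> bool:
    """True if the candidate corresponds to the primary cluster (associate),
    False otherwise (deny the association request)."""
    return cosine(candidate, centroid([embeddings[m] for m in primary])) >= threshold


# Hypothetical items from several language marketplaces sharing one identifier.
catalog = {
    "us-mug": [1.0, 0.1, 0.0],
    "de-tasse": [0.9, 0.2, 0.0],
    "jp-mug": [1.0, 0.0, 0.1],
    "fr-cable": [0.0, 0.1, 1.0],  # inconsistent listing under the same identifier
}
primary, secondary = resolve_identifier(catalog)
first_item_ok = admit([0.95, 0.05, 0.05], primary, catalog)   # request for a first item
second_item_ok = admit([0.0, 0.0, 1.0], primary, catalog)     # request for a second item
```

In this sketch the secondary clusters returned by `resolve_identifier` are the items a catalog service would disassociate from the identifier, and the boolean returned by `admit` stands in for the associate-or-deny decision on an incoming request. A production system would likely replace the greedy pass with a density- or hierarchy-based clustering method and would choose the primary cluster using verification signals rather than size alone.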