US 11,868,380 B1
Systems and methods for large-scale content exploration
Christina Pavlopoulou, Emeryville, CA (US); Manish Gupta, New York, NY (US); and Russ Thompson, Emeryville, CA (US)
Assigned to Amazon Technologies, Inc., Reno, NV (US)
Filed by Amazon Technologies, Inc., Reno, NV (US)
Filed on Aug. 7, 2019, as Appl. No. 16/534,798.
Int. Cl. G06F 16/00 (2019.01); G06F 16/332 (2019.01); G06F 17/16 (2006.01); G06F 16/35 (2019.01); G06F 16/9538 (2019.01); G06F 40/30 (2020.01); G06F 40/295 (2020.01)
CPC G06F 16/3323 (2019.01) [G06F 16/3329 (2019.01); G06F 16/358 (2019.01); G06F 16/9538 (2019.01); G06F 17/16 (2013.01); G06F 40/295 (2020.01); G06F 40/30 (2020.01)]
18 Claims
 
1. A computer-implemented method, comprising:
accessing a dataset including content from at least one online document;
determining, based on the content and at least in part on associated terms having word2vec similarity, a hierarchical topic model including at least one category and at least one subcategory, the at least one category and the at least one subcategory forming a semantic keyword hierarchy;
allocating data in the content according to the hierarchical topic model;
receiving a user search query through a computing device interface;
determining, based at least in part on the data allocated according to the hierarchical topic model, a set of search results, the set of search results including at least one direct result semantically relevant to the search query and at least one exploratory result, which is complementary and semantically unrelated to the search query, the at least one exploratory result determined based on an association between the content and the at least one category; and
presenting the set of search results to the user through the computing device interface, the presentation including the at least one category and the at least one subcategory.
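
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy vectors stand in for trained word2vec embeddings, and the greedy clustering, the threshold values, the one-keyword-per-document simplification, and every name (`VECTORS`, `cluster`, `build_hierarchy`, `allocate`, `search`) are assumptions introduced for illustration only. In particular, an "exploratory" result is approximated here as a document that falls below the direct-relevance threshold but shares a category with the query term.

```python
import math

# Hypothetical toy vectors standing in for trained word2vec embeddings.
VECTORS = {
    "camera": (1.0, 0.0, 0.0),
    "lens": (0.9, 0.1, 0.0),
    "tripod": (0.6, 0.8, 0.0),
    "tent": (0.0, 0.0, 1.0),
    "stove": (0.0, 0.1, 0.9),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def cluster(terms, threshold):
    """Greedy grouping: a term joins the first group whose seed (first member) is similar enough."""
    groups = []
    for term in terms:
        for group in groups:
            if cosine(VECTORS[term], VECTORS[group[0]]) >= threshold:
                group.append(term)
                break
        else:
            groups.append([term])
    return groups


def build_hierarchy(terms, category_threshold=0.5, subcategory_threshold=0.9):
    """Return {category: {subcategory: [terms]}}, labelling each group by its seed term."""
    return {
        cat[0]: {sub[0]: sub for sub in cluster(cat, subcategory_threshold)}
        for cat in cluster(terms, category_threshold)
    }


def allocate(docs, hierarchy):
    """Map each document id to the (category, subcategory) that holds its keyword."""
    where = {
        term: (cat, sub)
        for cat, subs in hierarchy.items()
        for sub, members in subs.items()
        for term in members
    }
    return {doc_id: where[keyword] for doc_id, keyword in docs.items()}


def search(query, docs, hierarchy, allocation, direct_threshold=0.9):
    """Return direct and exploratory results plus the category labels to present."""
    q = VECTORS[query]
    # Direct results: document keyword is highly similar to the query term.
    direct = [d for d, kw in docs.items() if cosine(q, VECTORS[kw]) >= direct_threshold]
    # Category of the query term within the keyword hierarchy.
    category = next(
        cat for cat, subs in hierarchy.items()
        for members in subs.values() if query in members
    )
    # Exploratory results: same category, but not similar enough to be direct.
    exploratory = [d for d in docs if allocation[d][0] == category and d not in direct]
    return {
        "category": category,
        "subcategories": sorted(hierarchy[category]),
        "direct": direct,
        "exploratory": exploratory,
    }


# Hypothetical dataset: each document is reduced to a single keyword.
DOCS = {"doc_lens": "lens", "doc_tripod": "tripod", "doc_stove": "stove"}
HIERARCHY = build_hierarchy(list(VECTORS))
ALLOCATION = allocate(DOCS, HIERARCHY)
RESULT = search("camera", DOCS, HIERARCHY, ALLOCATION)
```

In this sketch, a query for `camera` yields `doc_lens` as a direct result and `doc_tripod` as an exploratory result, with the category and its subcategories returned alongside for presentation. A real system would presumably learn the embeddings from the dataset and handle multi-term documents and queries.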