US 12,353,456 B2
Systems, methods, and apparatus for context-driven search
Vamsi Krishna Banda, Chicago, IL (US)
Assigned to Encyclopaedia Britannica, Inc., Chicago, IL (US)
Filed by Encyclopaedia Britannica, Inc., Chicago, IL (US)
Filed on Apr. 26, 2021, as Appl. No. 17/240,679.
Claims priority of provisional application 63/016,751, filed on Apr. 28, 2020.
Prior Publication US 2021/0334300 A1, Oct. 28, 2021
Int. Cl. G06F 16/334 (2025.01); G06F 16/338 (2019.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01)

CPC G06F 16/3344 (2019.01) [G06F 16/3347 (2019.01); G06F 16/338 (2019.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)]

27 Claims

1. An apparatus comprising:

memory to store machine-readable instructions; and

at least one processor to execute the machine-readable instructions to at least:

tokenize text from content into text portions, the text portions including a first text portion and a second text portion;

execute a first machine-learning model with the text portions as input to the first machine-learning model to encode the first text portion to define a first vector and encode the second text portion to define a second vector, the first machine-learning model to output the first vector and the second vector, the first machine-learning model trained with contextually similar text data and context corresponding to the contextually similar text data, the context associated with a pattern represented by the contextually similar text data;

determine, based on a comparison between the first vector and the second vector, a natural language similarity between the first text portion and the second text portion;

in response to the natural language similarity satisfying a threshold, combine the first text portion and the second text portion to generate a third text portion;

encode the third text portion to define a third vector different from the first vector and the second vector;

cause storage of the third vector to define a vector database;

query the vector database to extract related search results from the vector database by comparing the third vector to a query vector corresponding to a query; and

execute a second machine-learning model different from the first machine-learning model with the related search results as input to the second machine-learning model to generate rankings of the related search results as output from the second machine-learning model for presentation on a computing device, the second machine-learning model trained with data corresponding to at least one query and search results associated with the at least one query.
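
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the bag-of-words encoder stands in for the first trained machine-learning model, the term-overlap scorer stands in for the second trained ranking model, and an in-memory list stands in for the vector database. All function names, the sentence-based tokenizer, and the 0.3 threshold are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into sentence-like portions (assumed tokenization scheme)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def encode(portion):
    """Toy encoder: sparse bag-of-words vector.
    Stand-in for the first trained model, which is not specified here."""
    return Counter(re.findall(r"[a-z0-9]+", portion.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(text, threshold=0.3):
    """Combine adjacent portions whose similarity satisfies the threshold,
    then encode each combined portion and store it (list = toy vector database)."""
    merged = []
    for portion in tokenize(text):
        if merged and cosine(encode(merged[-1]), encode(portion)) >= threshold:
            merged[-1] = merged[-1] + " " + portion  # the combined "third" portion
        else:
            merged.append(portion)
    return [(m, encode(m)) for m in merged]


def search(index, query, top_k=3):
    """Compare the query vector with each stored vector; keep the best matches."""
    query_vec = encode(query)
    scored = [(cosine(vec, query_vec), text) for text, vec in index]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


def rerank(query, results):
    """Toy second stage: order results by the fraction of query terms they contain.
    Stand-in for the second trained model."""
    q_terms = set(encode(query))

    def score(text):
        return len(q_terms & set(encode(text))) / len(q_terms) if q_terms else 0.0

    return sorted(results, key=score, reverse=True)
```

In this sketch, build_index is called once on the content, and each query then passes through search followed by rerank. A real system would presumably replace encode with a learned embedding model and the list with an approximate nearest-neighbor index.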