US 11,900,064 B2
Neural network-based semantic information retrieval
Aaron Sisto, Oakland, CA (US); Nick Martin, Saint Augustine, FL (US); Brian Shin, Reno, NV (US); and Hung Nguyen, St. Rafael, CA (US)
Assigned to Searchable AI Corp, San Francisco, CA (US)
Filed by Searchable AI Corp, San Francisco, CA (US)
Filed on Nov. 22, 2021, as Appl. No. 17/532,896.
Application 17/532,896 is a continuation of application No. 17/323,465, filed on May 18, 2021, granted, now 11,182,433.
Prior Publication US 2022/0083603 A1, Mar. 17, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/04 (2023.01); G06F 40/30 (2020.01); G06F 16/9032 (2019.01); G06V 30/148 (2022.01)
CPC G06F 40/30 (2020.01) [G06F 16/90332 (2019.01); G06N 3/04 (2013.01); G06V 30/153 (2022.01)] 12 Claims
 
1. A computer program product in a non-transitory computer-readable medium for use in a data processing system for information and retrieval, the computer program product holding computer program instructions that, when executed by the data processing system, are configured to:
receive a corpus of documents associated with a user, wherein the documents are structured in two or more distinct formats;
for each document in the corpus, process the document to identify a set of information strings and, for each information string, encode at least a portion of the information string into an n-dimensional semantic vector;
store the n-dimensional semantic vectors for each document;
upon receipt of a query, process the query into an n-dimensional semantic query vector;
compare the n-dimensional semantic query vector against the stored n-dimensional vectors for each document and, in response, identifying a set of candidate n-dimensional vectors that represent a possible answer to the query, wherein identifying the set of candidate n-dimensional vectors applies a neural filter that has been trained against a dataset of question-answer data structured as groupings of candidate sentences for an example query, wherein for a given training example the neural filter is trained to identify a particular candidate sentence that includes an answer to the example query while remaining candidate sentences that do not include the answer are characterized by the neural filter as contrasting;
rank the candidate n-dimensional vectors; and
return as an answer to the query a data string represented by a given highest ranked candidate n-dimensional vector.
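
The flow recited in claim 1 (encode each information string, store the vectors, encode the query, compare, filter, rank, return) can be illustrated with a minimal sketch. It is not the patented implementation. Every function name below is hypothetical. The encoder is a normalised bag-of-words vector over a vocabulary built from the corpus, standing in for the neural encoder. A plain cosine-similarity threshold (`min_score`) stands in for the trained neural filter, which the claim describes as trained on groupings of candidate sentences per example query. An "information string" is assumed to be roughly one sentence, and the documents are assumed to have already been converted from their original formats to plain text.

```python
import math
import re


def tokenize(text):
    # Lowercase alphanumeric tokens; a deliberately simple assumption.
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_strings(document_text):
    # Assumption: one "information string" is approximated by one sentence.
    parts = re.split(r"(?<=[.!?])\s+", document_text.strip())
    return [p for p in parts if p]


def build_vocab(strings):
    # Map each distinct token to a vector position; n = vocabulary size.
    vocab = {}
    for s in strings:
        for tok in tokenize(s):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab):
    # Stand-in for the neural encoder: unit-length bag-of-words vector.
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(corpus):
    # corpus: dict mapping a document id to its plain-text content.
    strings = [(doc_id, s)
               for doc_id, text in corpus.items()
               for s in extract_strings(text)]
    vocab = build_vocab(s for _, s in strings)
    index = [(doc_id, s, encode(s, vocab)) for doc_id, s in strings]
    return vocab, index


def rank_candidates(query, vocab, index, min_score=0.1):
    q = encode(query, vocab)
    # Vectors are unit length (or all zero), so the dot product is the cosine.
    scored = [(sum(a * b for a, b in zip(q, vec)), doc_id, s)
              for doc_id, s, vec in index]
    # Placeholder for the trained neural filter: keep scores above a threshold.
    candidates = [c for c in scored if c[0] >= min_score]
    return sorted(candidates, key=lambda c: c[0], reverse=True)


def answer(query, vocab, index):
    # Return the data string behind the highest ranked candidate, if any.
    ranked = rank_candidates(query, vocab, index)
    return ranked[0][2] if ranked else None


# Two documents that originated in different formats (file names are illustrative).
corpus = {
    "invoice.pdf": "The invoice is due on March 3. Payment is by wire.",
    "menu.html": "Lunch menu includes soup and salad.",
}
vocab, index = build_index(corpus)
print(answer("When is the invoice due?", vocab, index))
```

In this sketch the filter and the ranking use the same similarity score. The claim treats them as distinct steps, so a closer approximation would replace the threshold with a separately trained model that scores each candidate against the query before ranking.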