US 10,891,943 B2
Intelligent short text information retrieve based on deep learning
Jinren Zhang, Nanjing (CN); Ke Xu, Nanjing (CN); Zhen Fan, Nanjing (CN); and Bo Chen, Nanjing (CN)
Assigned to Citrix Systems, Inc., Fort Lauderdale, FL (US)
Filed by Citrix Systems, Inc., Fort Lauderdale, FL (US)
Filed on Jan. 18, 2018, as Appl. No. 15/874,119.
Prior Publication US 2019/0221204 A1, Jul. 18, 2019
Int. Cl. G10L 15/16 (2006.01); G06F 17/16 (2006.01); G06N 3/04 (2006.01); G06F 16/93 (2019.01); G06F 16/483 (2019.01); G06F 40/30 (2020.01)
CPC G10L 15/16 (2013.01) [G06F 16/483 (2019.01); G06F 16/93 (2019.01); G06F 17/16 (2013.01); G06F 40/30 (2020.01); G06N 3/0454 (2013.01)]
20 Claims
 
1. A method to retrieve content based on text input, comprising:
receiving, by a data processing system, a request comprising a plurality of terms;
determining, by a vector generator executed by the data processing system, an average of a plurality of word vectors, the plurality of word vectors including a word vector retrieved for each term of the plurality of terms of the request, the plurality of word vectors generated by multiplying an encoded vector for a respective term of the plurality of terms by a matrix of weights provided by at least one intermediate layer of a neural network;
generating, by the vector generator using the average of the plurality of word vectors, a sentence vector to map the request to a first vector space;
retrieving, from a database by the vector generator, a plurality of trained sentence vectors corresponding to a plurality of candidate electronic documents, wherein each of the plurality of trained sentence vectors map a respective sentence of each of the plurality of candidate electronic documents to the first vector space;
determining, by a scoring engine executed by the data processing system, a distance in the first vector space between the sentence vector and each trained sentence vector of the plurality of trained sentence vectors;
generating, by the scoring engine, a similarity score for each of the plurality of trained sentence vectors based on the respective one of the plurality of trained sentence vectors and the sentence vector and the distance in the first vector space between the sentence vector and each trained sentence vector of the plurality of trained sentence vectors;
selecting, by the scoring engine, an electronic document from the plurality of candidate electronic documents based on a ranking of the similarity score of each of the plurality of trained sentence vectors; and
providing, by the data processing system, the electronic document.
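
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The vocabulary, the weight values, the two-dimensional vector space, the in-memory list standing in for the database, and all function names are hypothetical. The claim does not fix a distance metric or a scoring formula, so Euclidean distance and the mapping 1 / (1 + distance) are assumptions chosen only for illustration.

```python
import math

# Hypothetical toy vocabulary; a real system would use a far larger one.
VOCAB = ["reset", "password", "printer", "install", "driver"]

# Hypothetical weight matrix standing in for the weights of an intermediate
# layer of a trained neural network: one row per vocabulary term.
WEIGHTS = [
    [0.9, 0.1],
    [0.8, 0.3],
    [0.1, 0.9],
    [0.2, 0.7],
    [0.0, 1.0],
]


def one_hot(term):
    """Encode a term as a one-hot vector over VOCAB (all zeros if unknown)."""
    return [1.0 if term == known else 0.0 for known in VOCAB]


def word_vector(term):
    """Multiply the encoded vector by the weight matrix to get a word vector."""
    encoded = one_hot(term)
    dims = len(WEIGHTS[0])
    return [
        sum(encoded[i] * WEIGHTS[i][d] for i in range(len(VOCAB)))
        for d in range(dims)
    ]


def sentence_vector(terms):
    """Average the word vectors of all terms to form a sentence vector."""
    if not terms:
        raise ValueError("request must contain at least one term")
    vectors = [word_vector(term) for term in terms]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Map distance to a score in (0, 1]; closer vectors score higher (assumed formula)."""
    return 1.0 / (1.0 + distance(a, b))


def retrieve(request_terms, trained_sentence_vectors):
    """Rank stored sentence vectors against the request and return the best document id.

    trained_sentence_vectors is a list of (document_id, vector) pairs; a
    document may contribute several sentences.
    """
    query = sentence_vector(request_terms)
    ranked = sorted(
        trained_sentence_vectors,
        key=lambda pair: similarity(query, pair[1]),
        reverse=True,
    )
    return ranked[0][0]


# Hypothetical stand-in for the database of trained sentence vectors.
TRAINED = [
    ("doc-password", sentence_vector(["reset", "password"])),
    ("doc-printer", sentence_vector(["install", "printer", "driver"])),
]

best = retrieve(["password", "reset"], TRAINED)
```

Because the sentence vector is an average, word order in the request does not affect the result: the request "password reset" maps to the same point as the stored sentence "reset password".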