| CPC G06F 16/24578 (2019.01) [G06F 16/248 (2019.01); G06F 16/93 (2019.01); G06N 20/00 (2019.01)] | 17 Claims |

|
1. A method for ranking documents in search results, the method comprising:
retrieving a plurality of search result sets, each search result set being associated with a user query and comprising an ordered plurality of documents;
defining a training data set based on the plurality of search result sets by, for each search result set:
determining an observation window including a pre-defined number of documents ordered after a responsive document from the search result set, the responsive document representative of a document selected by a user from the search result set; and
discarding documents from the search result set that are ordered below the pre-defined number of documents after the responsive document;
training a machine learning model via the training data set;
receiving a further user query;
presenting a list of responsive documents ranked by the trained machine learning model;
receiving an indication of a responsive document from the presented list of responsive documents;
processing the list of responsive documents to discard documents from the list of responsive documents that are outside of the observation window from the indicated responsive document; and
adding the processed list of responsive documents to the training data set,
wherein, by not including the discarded documents, the training data set reduces a bias representative of a user selection of the responsive document over the discarded documents at the machine learning model.
|
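The observation-window step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The function name `truncate_to_window`, the parameter names, and the binary click labels are hypothetical and chosen only for illustration. The sketch also assumes that documents ranked above the selected document are retained, and that only documents ranked more than the pre-defined number of positions after the selected document are discarded.

```python
from typing import List, Tuple


def truncate_to_window(
    ranked_docs: List[str], clicked_index: int, window: int
) -> List[Tuple[str, int]]:
    """Return (doc_id, label) pairs for the documents kept for training.

    ranked_docs   -- document ids in the order they were presented
    clicked_index -- 0-based position of the document the user selected
    window        -- pre-defined number of documents to keep after the click
    """
    if not 0 <= clicked_index < len(ranked_docs):
        raise ValueError("clicked_index is outside the result set")
    if window < 0:
        raise ValueError("window must be non-negative")

    # Keep everything up to and including the last document in the window;
    # documents ordered below that position are discarded.
    cutoff = clicked_index + window + 1
    kept = ranked_docs[:cutoff]

    # Hypothetical labelling: 1 for the selected document, 0 for the others.
    return [(doc, 1 if i == clicked_index else 0) for i, doc in enumerate(kept)]


# Example: the user selects the second of six results, window of two documents.
results = ["d1", "d2", "d3", "d4", "d5", "d6"]
training_rows = truncate_to_window(results, clicked_index=1, window=2)
# d5 and d6 fall outside the window and never reach the training data set.
```

Under these assumptions, the same function could be applied both when the initial training data set is defined and when a newly presented list is processed after a further user query, so that documents the user likely never examined are not treated as rejected.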