| CPC G06N 3/08 (2013.01) [G06F 16/24578 (2019.01); G06F 40/216 (2020.01); G06N 7/01 (2023.01); G06F 16/338 (2019.01); G06N 3/006 (2013.01); G06N 3/02 (2013.01); G06N 3/084 (2013.01); G06N 7/023 (2013.01); G06N 7/026 (2013.01)] | 20 Claims |

|
1. A method, comprising:
using a scoring model to score each comment, of comments, based on corresponding features of the comment;
receiving, from a client device, a request to serve a subset of the comments comprising some but not all of the comments scored using the scoring model;
determining a plurality of possible rankings of the comments associated with a plurality of possible permutations, wherein the plurality of possible rankings of the comments comprises a first possible ranking of the comments associated with a first possible permutation and a second possible ranking of the comments associated with a second possible permutation;
responsive to the request to serve the subset of the comments comprising some but not all of the comments scored using the scoring model, selecting a ranking of the comments that is one permutation from the plurality of possible rankings of the comments, wherein selecting the ranking is in accordance with a probability distribution of the plurality of possible rankings that is based on scores of the comments, wherein the ranking of the comments is associated with increasing a reward at a given time by representing the comments in an order of the ranking, receiving one or more measurable reactions as a scalar reward for the ranking, and updating a ranking mechanism to increase the reward; and
serving one or more comments identified by the selected ranking over a network to the client device.
|
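The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. The claim does not name a specific distribution or update rule, so the sketch assumes a Plackett-Luce distribution over permutations and a REINFORCE-style policy-gradient update, with a linear scoring model. All function names (`score_comments`, `sample_ranking`, `grad_log_prob`, `update_weights`, `serve`) and parameters (`weights`, `lr`, `k`) are hypothetical.

```python
import math
import random


def score_comments(weights, features):
    """Assumed linear scoring model: score = w . x for each comment's features."""
    return [sum(w * x for w, x in zip(weights, f)) for f in features]


def sample_ranking(scores, rng):
    """Sample one permutation from an assumed Plackett-Luce distribution.

    At each position, a remaining comment is drawn with probability
    proportional to exp(score), so every permutation has nonzero probability.
    """
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        m = max(scores[i] for i in remaining)  # subtract max for numerical stability
        probs = [math.exp(scores[i] - m) for i in remaining]
        pick = rng.choices(remaining, weights=probs, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def grad_log_prob(scores, ranking):
    """Gradient of log P(ranking) with respect to each comment's score."""
    grad = [0.0] * len(scores)
    remaining = list(ranking)
    for chosen in ranking:
        m = max(scores[i] for i in remaining)
        exps = {i: math.exp(scores[i] - m) for i in remaining}
        total = sum(exps.values())
        for i in remaining:
            grad[i] -= exps[i] / total  # softmax term over comments still unplaced
        grad[chosen] += 1.0             # indicator term for the comment placed here
        remaining.remove(chosen)
    return grad


def update_weights(weights, features, ranking, reward, lr=0.1):
    """REINFORCE-style step (an assumption): w += lr * reward * d log P / d w."""
    scores = score_comments(weights, features)
    g = grad_log_prob(scores, ranking)
    return [
        w + lr * reward * sum(g[i] * features[i][d] for i in range(len(features)))
        for d, w in enumerate(weights)
    ]


def serve(weights, features, k, rng):
    """Score all comments, sample a full ranking, and return its top-k subset."""
    scores = score_comments(weights, features)
    ranking = sample_ranking(scores, rng)
    return ranking, ranking[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]  # made-up features
    weights = [0.0, 0.0]
    ranking, served = serve(weights, features, k=2, rng=rng)
    reward = 1.0  # stand-in for a measured reaction such as a click or reply
    weights = update_weights(weights, features, ranking, reward)
    print(ranking, served, weights)
```

Sampling a ranking instead of always sorting by score lets lower-scored comments occasionally appear, so the scalar reward can inform later updates. Other distributions or update rules would be equally consistent with the claim language.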