US 12,353,968 B2
Methods and systems for generating training data for computer-executable machine learning algorithm within a computer-implemented crowdsource environment
Nikita Vitalevich Pavlichenko, Moscow (RU); Valentina Pavlovna Fedorova, Sergiev Posad (RU); and Valentin Andreevich Biryukov, Orenburg (RU)
Assigned to Y.E. Hub Armenia LLC, Yerevan (AM)
Filed by YANDEX EUROPE AG, Lucerne (CH)
Filed on Feb. 14, 2022, as Appl. No. 17/670,662.
Claims priority of application No. 2021114640 (RU), filed on May 24, 2021.
Prior Publication US 2022/0374770 A1, Nov. 24, 2022
Int. Cl. G06N 20/00 (2019.01); G06V 10/764 (2022.01); G06V 10/774 (2022.01); G06V 10/776 (2022.01)
CPC G06N 20/00 (2019.01) [G06V 10/764 (2022.01); G06V 10/774 (2022.01); G06V 10/776 (2022.01)]
14 Claims
1. A computer-implemented method of generating training data for a computer-executable Machine Learning Algorithm (MLA), the training data being based on one or more digital tasks accessible by a plurality of assessors within a computer-implemented crowdsource environment, the method being executable by a server accessible over a communication network by electronic devices associated with the plurality of assessors, the method comprising:
accessing, by the server, assessor data associated with the plurality of assessors, the assessor data including information indicative of past performance of respective ones from the plurality of assessors when executing digital tasks of a first type and digital tasks of a second type, and wherein the assessor data comprises information indicating a difficulty of the digital tasks,
the digital tasks of the first type and the digital tasks of the second type being digital tasks of a first class of digital tasks;
generating, by the server, a first ranked list of assessors based on their past performance when executing the digital tasks of the first type, wherein generating the first ranked list comprises assigning a weighted coefficient to each previously executed task of the digital tasks of the first type based on a difficulty of the respective digital task and applying the weighted coefficient to a score of each of the previously executed digital tasks of the first type;
generating, by the server, a second ranked list of assessors based on their past performance when executing the digital tasks of the second type, wherein generating the second ranked list comprises assigning a weighted coefficient to each previously executed task of the digital tasks of the second type based on a difficulty of the respective digital task and applying the weighted coefficient to a score of each of the previously executed digital tasks of the second type;
for a given one of the plurality of assessors:
generating, by the server, a first score for the digital tasks of the first type using the first ranked list of assessors,
the first score being indicative of a past performance of the given one of the plurality of assessors when executing the digital tasks of the first type relative to the past performance of other ones from the plurality of assessors when executing the digital tasks of the first type;
generating, by the server, a second score for the digital tasks of the second type using the second ranked list of assessors,
the second score being indicative of a past performance of the given one of the plurality of assessors when executing the digital tasks of the second type relative to the past performance of other ones from the plurality of assessors when executing the digital tasks of the second type;
generating, by the server, a class score for the first class of digital tasks as a combination of the first score and the second score;
acquiring, by the server, a request for executing a digital task of a third type being different from the first type and the second type, wherein the digital task is of a second class different from the first class, and wherein the plurality of assessors have not previously completed any tasks of the second class;
ranking, by the server, the plurality of assessors based on respective class scores for the first class of digital tasks, the given one from the plurality of assessors being one of top ranked ones from the plurality of assessors;
transmitting, by the server over the communication network, the digital task of the third type to the electronic device associated with the given one from the plurality of assessors;
generating, by the server, the training data for the MLA based on a response from the given one from the plurality of assessors executing the digital task of the third type.
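The scoring and selection steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not fix any formulas, so several choices below are assumptions made for illustration only: the difficulty value is used directly as the weighted coefficient, the per-type ranking uses a difficulty-weighted mean score, the relative score is the share of other assessors ranked below the given assessor, and the class score is the arithmetic mean of the per-type scores. All names (`CompletedTask`, `ranked_list`, `relative_score`, `class_scores`, `select_assessors`) and all sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletedTask:
    assessor: str
    task_type: str     # e.g. "first" or "second", both of the first class
    score: float       # quality score of the completed task, 0.0 to 1.0
    difficulty: float  # assumption: used directly as the weighted coefficient


def ranked_list(history, task_type):
    """Rank assessors by difficulty-weighted mean score on one task type."""
    totals = {}
    for task in history:
        if task.task_type != task_type:
            continue
        weighted, weight = totals.get(task.assessor, (0.0, 0.0))
        totals[task.assessor] = (
            weighted + task.difficulty * task.score,
            weight + task.difficulty,
        )
    means = {a: w / d for a, (w, d) in totals.items() if d > 0}
    # Best first; ties are broken alphabetically so the order is deterministic.
    return sorted(means, key=lambda a: (-means[a], a))


def relative_score(ranking, assessor):
    """Share of other assessors ranked below this one (1.0 means top)."""
    if assessor not in ranking:
        return 0.0
    if len(ranking) == 1:
        return 1.0
    return 1.0 - ranking.index(assessor) / (len(ranking) - 1)


def class_scores(history, task_types):
    """Assumption: combine the per-type relative scores with a plain mean."""
    rankings = [ranked_list(history, t) for t in task_types]
    assessors = sorted({task.assessor for task in history})
    return {
        a: sum(relative_score(r, a) for r in rankings) / len(rankings)
        for a in assessors
    }


def select_assessors(scores, top_k):
    """Pick the top-ranked assessors for a task of a not-yet-seen class."""
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_k]


# Hypothetical history of tasks of the first and second type.
history = [
    CompletedTask("ann", "first", 1.0, 2.0),
    CompletedTask("ann", "first", 0.5, 1.0),
    CompletedTask("bob", "first", 0.9, 1.0),
    CompletedTask("cat", "first", 0.4, 1.0),
    CompletedTask("ann", "second", 0.9, 1.0),
    CompletedTask("bob", "second", 0.6, 1.0),
    CompletedTask("cat", "second", 0.7, 2.0),
]

scores = class_scores(history, ["first", "second"])
chosen = select_assessors(scores, top_k=1)
print(scores, chosen)
```

In this sketch, the assessors returned by `select_assessors` would receive the digital task of the third type, and their responses would then feed the training data for the MLA. A different weighting scheme or combination rule could be substituted without changing the overall flow.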