US 11,990,058 B2
Machine grading of short answers with explanations
Murali Krishna Teja Kilari, San Francisco, CA (US); Shane Curtis Mooney, San Mateo, CA (US); and Lingfeng Cheng, Burlingame, CA (US)
Assigned to Quizlet, Inc., San Francisco, CA (US)
Filed by QUIZLET, INC., San Francisco, CA (US)
Filed on Sep. 19, 2022, as Appl. No. 17/947,944.
Application 17/947,944 is a continuation of application No. 17/501,429, filed on Oct. 14, 2021, granted, now Pat. No. 11,450,225.
Prior Publication US 2023/0120965 A1, Apr. 20, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G09B 7/04 (2006.01); G06F 17/18 (2006.01); G06F 40/20 (2020.01); G06N 20/20 (2019.01)
CPC G09B 7/04 (2013.01) [G06F 17/18 (2013.01); G06F 40/20 (2020.01); G06N 20/20 (2019.01)] 20 Claims
OG exemplary drawing
 
1. A computer-implemented method comprising:
digitally storing, in memory of a server computer, a plurality of machine learning models, the plurality of machine learning models comprising a plurality of Teacher models and a Student model, each of the machine learning models comprising a multi-layer bidirectional Transformer encoder and having been trained with at least one corpus of unlabeled training data using Masked Language Modeling;
updating, in the memory of the server computer, each Teacher model by further programmatically training that Teacher model to perform an Automatic Short Answer Grading task with a labeled ground truth data set, the labeled ground truth data set comprising a plurality of data triplets, each data triplet comprising a response text, a corresponding reference answer text, and a corresponding binary label;
executing each of the Teacher models to cause programmatically generating and storing, in the memory of the server computer, a respective set of class probabilities on an unlabeled task-specific data set for the Automatic Short Answer Grading task;
updating, in the memory of the server computer, the Student model by further programmatically training the Student model, with the unlabeled task-specific data set, to minimize an error between predictions of the Student model and predictions of a linear ensemble of the Teacher models;
receiving, at the server computer, digital input comprising a target response text and a corresponding target reference answer text;
programmatically inputting the target response text and the corresponding target reference answer text to the Student model, thereby outputting a corresponding predicted binary label; and
causing to be displayed, in a graphical user interface displayed on a device display of a client computing device, correction data indicating the corresponding predicted binary label.
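
The claimed method begins by pretraining each model, a multi-layer bidirectional Transformer encoder, on an unlabeled corpus with Masked Language Modeling. The sketch below illustrates that objective in PyTorch; the encoder size, the 15% masking rate, and all identifiers (Encoder, mlm_step) are assumptions for illustration and are not specified by the claim.

```python
# Minimal sketch of Masked Language Modeling (MLM) pretraining for a
# multi-layer bidirectional Transformer encoder.  Sizes, the 15% masking
# rate, and all names are illustrative assumptions, not the patented recipe.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE, MASK_ID, MASK_PROB = 30_000, 1, 0.15

class Encoder(nn.Module):
    def __init__(self, d_model=256, n_layers=4, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)   # no causal mask -> bidirectional
        self.lm_head = nn.Linear(d_model, VOCAB_SIZE)            # predicts the masked tokens

    def forward(self, token_ids):
        return self.lm_head(self.encoder(self.embed(token_ids)))  # (batch, seq, vocab)

def mlm_step(model, token_ids, optimizer):
    """Mask a random subset of tokens and train the encoder to recover them."""
    mask = torch.rand_like(token_ids, dtype=torch.float) < MASK_PROB
    inputs = token_ids.masked_fill(mask, MASK_ID)
    labels = token_ids.masked_fill(~mask, -100)      # ignore unmasked positions in the loss
    logits = model(inputs)
    loss = F.cross_entropy(logits.view(-1, VOCAB_SIZE), labels.view(-1), ignore_index=-100)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = Encoder()
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    batch = torch.randint(2, VOCAB_SIZE, (8, 128))   # stand-in for a tokenized unlabeled corpus
    print(mlm_step(model, batch, opt))
```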
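The claim then recites fine-tuning each Teacher model for the Automatic Short Answer Grading task on labeled triplets of (response text, reference answer text, binary label), executing the Teachers to generate class probabilities on an unlabeled task-specific data set, and training the Student to minimize an error between its predictions and those of a linear ensemble of the Teachers. A minimal sketch of those three steps follows; GraderModel, the pairing of response and reference into one sequence, equal ensemble weights, and the KL-divergence error are assumptions, since the claim does not fix a particular loss, weighting, or architecture size.

```python
# Sketch of Teacher fine-tuning and Teacher->Student distillation for
# Automatic Short Answer Grading (ASAG).  GraderModel, equal ensemble
# weights, and the KL-divergence loss are illustrative assumptions,
# not the patented implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE, NUM_CLASSES = 30_000, 2              # binary label: correct / incorrect

class GraderModel(nn.Module):
    """Bidirectional Transformer encoder over a (response, reference answer)
    pair plus a 2-way classification head."""
    def __init__(self, d_model=256, n_layers=4, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.classifier = nn.Linear(d_model, NUM_CLASSES)

    def forward(self, token_ids):                # token_ids encode "response [SEP] reference answer"
        h = self.encoder(self.embed(token_ids))
        return self.classifier(h[:, 0])          # logits from the first ([CLS]-style) position

def finetune_teacher_step(teacher, token_ids, labels, optimizer):
    """Supervised ASAG step on one batch of labeled (response, reference, binary label) triplets."""
    loss = F.cross_entropy(teacher(token_ids), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

def ensemble_probs(teachers, token_ids, weights):
    """Linear ensemble: weighted average of each Teacher's class probabilities."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(t(token_ids), dim=-1) for t in teachers])  # (T, batch, 2)
        return (torch.tensor(weights).view(-1, 1, 1) * probs).sum(dim=0)          # (batch, 2)

def distill_student_step(student, teachers, token_ids, weights, optimizer):
    """Update the Student on unlabeled task-specific data to match the ensemble's predictions."""
    target = ensemble_probs(teachers, token_ids, weights)
    loss = F.kl_div(F.log_softmax(student(token_ids), dim=-1), target, reduction="batchmean")
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

if __name__ == "__main__":
    teachers = [GraderModel() for _ in range(3)]          # assume three Teacher models
    student = GraderModel(d_model=128, n_layers=2)        # smaller, faster Student
    labeled = torch.randint(0, VOCAB_SIZE, (8, 128)), torch.randint(0, 2, (8,))
    unlabeled = torch.randint(0, VOCAB_SIZE, (8, 128))
    for t in teachers:
        finetune_teacher_step(t, *labeled, torch.optim.AdamW(t.parameters(), lr=3e-5))
    opt = torch.optim.AdamW(student.parameters(), lr=3e-5)
    print(distill_student_step(student, teachers, unlabeled, [1/3, 1/3, 1/3], opt))
```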
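At inference time, the method receives a target response text and its corresponding target reference answer text, runs only the Student model to output a predicted binary label, and causes correction data indicating that label to be displayed in a graphical user interface on the client device. The fragment below continues the preceding sketch (student, GraderModel, and VOCAB_SIZE are defined there); tokenize and the returned correction-data dictionary are hypothetical stand-ins, not the patent's serving interface.

```python
# Continues the previous sketch: `student` and VOCAB_SIZE are defined above.
# `tokenize` is a hypothetical placeholder for a real subword tokenizer.
import torch

def tokenize(text, max_len=128):
    """Toy stand-in tokenizer: hash words into the vocabulary and pad to max_len."""
    ids = [hash(w) % (VOCAB_SIZE - 2) + 2 for w in text.lower().split()][:max_len]
    return torch.tensor([ids + [0] * (max_len - len(ids))])   # (1, max_len), 0 = padding

def grade(student, response_text, reference_text):
    """Predict the binary label for one (target response, target reference answer) pair."""
    token_ids = tokenize(response_text + " [SEP] " + reference_text)
    with torch.no_grad():
        label = int(student(token_ids).argmax(dim=-1))        # 1 = correct, 0 = incorrect (assumed)
    return {"predicted_label": label,                         # correction data sent to the client UI
            "message": "Correct" if label else "Incorrect"}

print(grade(student, "mitochondria make ATP", "The mitochondrion produces ATP for the cell"))
```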