CPC G06N 3/08 (2013.01) | 16 Claims |
1. A method of training a model, the method comprising:
acquiring a recognition result of a teacher model and a recognition result of a student model for an input sequence;
determining an adversarial loss based on a degree to which an output sequence of the teacher model and an output sequence of the student model that are respectively output as recognition results for the input sequence are distinguished from each other; and
training the student model to reduce the adversarial loss,
wherein the determining of the adversarial loss comprises determining the adversarial loss by applying a Gumbel-max based on probabilities of elements included in the output sequence of the teacher model.
|
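As an illustration of the Gumbel-max step recited in claim 1, the following is a minimal sketch and not the claimed implementation. It assumes the teacher model exposes, for each position of its output sequence, a probability vector over candidate elements. The function names `gumbel_max_sample`, `sample_sequence`, and `adversarial_loss` are hypothetical, and the loss assumes a separate discriminator (not specified in the claim) that outputs the probability that a student sequence came from the teacher.

```python
import math
import random


def gumbel_max_sample(probs, rng):
    """Draw one index from a categorical distribution using the Gumbel-max trick.

    Each candidate gets the score log(p) + g, where g is Gumbel(0, 1) noise;
    the argmax of the scores is distributed according to `probs`.
    """
    best_index, best_score = -1, -math.inf
    for index, p in enumerate(probs):
        if p <= 0.0:
            continue  # zero-probability elements can never be selected
        u = rng.random()
        while u == 0.0:  # avoid log(0) when drawing the noise
            u = rng.random()
        gumbel_noise = -math.log(-math.log(u))
        score = math.log(p) + gumbel_noise
        if score > best_score:
            best_index, best_score = index, score
    return best_index


def sample_sequence(step_probs, rng):
    """Sample one element per position of the teacher output sequence."""
    return [gumbel_max_sample(probs, rng) for probs in step_probs]


def adversarial_loss(prob_student_judged_as_teacher):
    """Hypothetical loss for the student: small when an assumed discriminator
    cannot distinguish the student sequence from the teacher sequence."""
    return -math.log(prob_student_judged_as_teacher)


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed teacher probabilities for a 3-position output sequence.
    teacher_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
    print(sample_sequence(teacher_probs, rng))
    print(adversarial_loss(0.5))
```

In this sketch, the sampled teacher sequence would be what the assumed discriminator compares against the student output; training the student to reduce `adversarial_loss` corresponds to making the two sequences harder to tell apart.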