| CPC G06N 3/08 (2013.01) [G06N 3/063 (2013.01); G10L 15/26 (2013.01); G10L 25/54 (2013.01)] | 20 Claims |

|
1. A method of training a model, the method comprising:
configuring a first transformer for visual learning with a first set of weights;
configuring a second transformer for textual learning with a second set of weights;
adjusting at least the second set of weights based on minimizing a weight difference between the first set of weights and the second set of weights;
replacing the first set of weights for the first transformer with the adjusted second set of weights; and
updating the first transformer based on the adjusted second set of weights.
|
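The steps recited in claim 1 can be illustrated with a toy sketch. It is not the patented implementation: the claim does not specify the difference metric, the optimizer, or the model structure. The sketch assumes a squared L2 distance as the "weight difference", plain gradient descent on the second (textual) weights only, and flat lists of floats standing in for transformer weights. `ToyTransformer`, `weight_difference`, and `adjust_second_weights` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyTransformer:
    """Hypothetical stand-in for a transformer: just a flat weight list."""
    weights: List[float]


def weight_difference(w_a: List[float], w_b: List[float]) -> float:
    """Assumed metric: squared L2 distance between two weight sets."""
    return sum((a - b) ** 2 for a, b in zip(w_a, w_b))


def adjust_second_weights(w_first: List[float], w_second: List[float],
                          lr: float = 0.1, steps: int = 100) -> List[float]:
    """Gradient descent on the weight difference w.r.t. the second set only."""
    w = list(w_second)
    for _ in range(steps):
        # d/dw_i (v_i - w_i)^2 = 2 * (w_i - v_i)
        grads = [2.0 * (wi - vi) for vi, wi in zip(w_first, w)]
        w = [wi - lr * g for wi, g in zip(w, grads)]
    return w


# Configure a first (visual) and a second (textual) transformer.
visual = ToyTransformer(weights=[0.5, -1.0, 2.0])
textual = ToyTransformer(weights=[0.0, 0.0, 0.0])

# Adjust the second set of weights by minimizing the weight difference.
before = weight_difference(visual.weights, textual.weights)
textual.weights = adjust_second_weights(visual.weights, textual.weights)
after = weight_difference(visual.weights, textual.weights)

# Replace the first set of weights with the adjusted second set,
# which updates the first transformer.
visual.weights = list(textual.weights)
```

In a real system the weight-difference term would presumably be combined with task losses for the visual and textual objectives; minimizing the difference alone, as here, simply pulls the second set toward the first.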