US 12,468,944 B2
Method and system for learning to share weights across transformer backbones in vision and language tasks
Burak Uzkent, Mountain View, CA (US); Vasili Ramanishka, Mountain View, CA (US); Yilin Shen, Santa Clara, CA (US); and Hongxia Jin, San Jose, CA (US)
Assigned to SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Sep. 8, 2022, as Appl. No. 17/940,709.
Claims priority of provisional application 63/319,732, filed on Mar. 14, 2022.
Prior Publication US 2023/0289590 A1, Sep. 14, 2023
Int. Cl. G06N 3/08 (2023.01); G06N 3/063 (2023.01); G10L 15/26 (2006.01); G10L 25/54 (2013.01)
CPC G06N 3/08 (2013.01) [G06N 3/063 (2013.01); G10L 15/26 (2013.01); G10L 25/54 (2013.01)]
20 Claims
 
1. A method of training a model, the method comprising:
configuring a first transformer for visual learning with a first set of weights;
configuring a second transformer for textual learning with a second set of weights;
adjusting at least the second set of weights based on minimizing a weight difference between the first set of weights and the second set of weights;
replacing the first set of weights for the first transformer with the adjusted second set of weights; and
updating the first transformer based on the adjusted second set of weights.
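
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each set of weights is a flat list of floats of equal length, that the weight difference is a squared L2 distance, and that the adjustment is plain gradient descent on that distance alone, with no task loss. The names ToyTransformer, weight_difference, adjust_second_weights, lr and steps are hypothetical and do not appear in the claim.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyTransformer:
    """Hypothetical stand-in for a transformer backbone: only a flat weight list."""
    weights: List[float]


def weight_difference(first: List[float], second: List[float]) -> float:
    """Squared L2 distance between two weight sets (an assumed choice of metric)."""
    return sum((a - b) ** 2 for a, b in zip(first, second))


def adjust_second_weights(first: List[float], second: List[float],
                          lr: float = 0.1, steps: int = 20) -> List[float]:
    """Gradient descent on the weight difference, changing the second set only."""
    adjusted = list(second)
    for _ in range(steps):
        # d/db (a - b)^2 = -2 (a - b), so a descent step moves b toward a.
        adjusted = [b + lr * 2.0 * (a - b) for a, b in zip(first, adjusted)]
    return adjusted


# Configure the first (visual) and second (textual) transformers.
vision = ToyTransformer(weights=[0.5, -1.0, 2.0])
text = ToyTransformer(weights=[0.0, 0.0, 0.0])

# Adjust the second set of weights by minimizing the weight difference.
before = weight_difference(vision.weights, text.weights)
text.weights = adjust_second_weights(vision.weights, text.weights)
after = weight_difference(vision.weights, text.weights)

# Replace the first set of weights with a copy of the adjusted second set.
# In this toy model, that assignment is also the update of the first transformer.
vision.weights = list(text.weights)
```

In this sketch the difference shrinks by a constant factor at every step, so `after` is smaller than `before`, and after the replacement both backbones hold identical weight values. A practical training loop would presumably combine such a difference term with the task losses of both transformers; that combination is an assumption and is not stated in the claim.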