US 11,676,008 B2
Parameter-efficient multi-task and transfer learning
Mark Sandler, Mountain View, CA (US); Andrey Zhmoginov, Mountain View, CA (US); Andrew Gerald Howard, Culver City, CA (US); and Pramod Kaushik Mudrakarta, Chicago, IL (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Sep. 20, 2019, as Appl. No. 16/577,698.
Claims priority of provisional application 62/737,763, filed on Sep. 27, 2018.
Prior Publication US 2020/0104706 A1, Apr. 2, 2020
Int. Cl. G10L 15/16 (2006.01); G06N 3/08 (2023.01); G06N 3/045 (2023.01)
CPC G06N 3/08 (2013.01) [G06N 3/045 (2023.01); G10L 15/16 (2013.01)] 26 Claims
 
1. A computer-implemented method, the method comprising:
obtaining, by one or more computing devices, a machine-learned model that has been previously trained on a first training dataset to perform a first task, the machine-learned model including a first set of learnable parameters;
modifying, by the one or more computing devices, the machine-learned model to include a model patch, the model patch including a second set of learnable parameters, wherein the machine-learned model comprises a plurality of layers, and at least some of the second set of learnable parameters included in the model patch comprise one or both of scale and bias parameters for one or more layers of the plurality of layers; and
after modifying the machine-learned model to include the model patch, training, by the one or more computing devices, the machine-learned model on a second training dataset to perform a second task that is different from the first task, wherein training, by the one or more computing devices, the machine-learned model on the second training dataset to perform the second task comprises learning new values for the second set of learnable parameters included in the model patch.
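The claim above can be illustrated with a minimal sketch: a pre-trained network is "patched" by marking only per-layer scale and bias parameters (here, the affine parameters of a normalization layer) as trainable, then trained on a second task so that only the patch learns new values. All names and sizes below (`SmallNet`, the layer shapes, the stand-in data) are hypothetical choices for illustration, not taken from the patent.

```python
import torch
import torch.nn as nn

# Hypothetical base model standing in for the previously trained network.
class SmallNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.bn = nn.BatchNorm2d(8)   # its weight (scale) and bias form the "model patch"
        self.head = nn.Linear(8, num_classes)

    def forward(self, x):
        x = torch.relu(self.bn(self.conv(x)))
        x = x.mean(dim=(2, 3))        # global average pooling
        return self.head(x)

def apply_model_patch(model):
    """Freeze every parameter except the per-layer scale/bias parameters
    (the BatchNorm affine parameters here), which act as the model patch."""
    patch_params = []
    for name, p in model.named_parameters():
        if "bn" in name:              # scale ('weight') and bias of the norm layer
            p.requires_grad = True
            patch_params.append(p)
        else:
            p.requires_grad = False
    return patch_params

model = SmallNet()
patch = apply_model_patch(model)

# Train only the patch on a random stand-in batch for the "second task".
opt = torch.optim.SGD(patch, lr=0.1)
x = torch.randn(4, 3, 8, 8)
y = torch.randint(0, 2, (4,))
frozen_before = model.head.weight.clone()
loss = nn.functional.cross_entropy(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()

# The frozen parameters are unchanged; only the patch was updated.
assert torch.equal(model.head.weight, frozen_before)
```

Because only the scale/bias patch is optimized, each additional task adds a handful of parameters per layer rather than a full copy of the model, which is the parameter efficiency the claim is directed to.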