US 12,314,865 B2
Transfer learning system for automated software engineering tasks
Colin Bruce Clement, Seattle, WA (US); Dawn Drain, San Francisco, CA (US); Neelakantan Sundaresan, Bellevue, WA (US); and Alexey Svyatkovskiy, Bellevue, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by MICROSOFT TECHNOLOGY LICENSING, LLC., Redmond, WA (US)
Filed on Jan. 17, 2024, as Appl. No. 18/415,048.
Application 18/415,048 is a continuation of application No. 17/981,440, filed on Nov. 6, 2022, granted, now Pat. No. 11,900,261.
Application 17/981,440 is a continuation of application No. 16/917,267, filed on Jun. 30, 2020, granted, now Pat. No. 11,521,075.
Claims priority of provisional application 63/025,529, filed on May 15, 2020.
Prior Publication US 2024/0160940 A1, May 16, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/08 (2023.01); G06F 8/30 (2018.01); G06F 17/00 (2019.01); G06F 40/40 (2020.01); G06N 3/04 (2023.01); G06N 3/045 (2023.01); G06N 3/088 (2023.01); G06F 8/41 (2018.01); G06F 8/71 (2018.01); G06N 3/063 (2023.01); G06N 20/00 (2019.01)

CPC G06N 3/088 (2013.01) [G06F 8/30 (2013.01); G06F 40/40 (2020.01); G06N 3/045 (2023.01); G06F 8/427 (2013.01); G06F 8/71 (2013.01); G06N 3/04 (2013.01); G06N 3/063 (2013.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01)]

20 Claims

1. A system, comprising:

a processor; and

a memory that stores a program configured to be executed by the processor, the program including instructions to perform actions that:

provide a plurality of neural transformer models with attention trained on source code and/or natural language, wherein each of the plurality of neural transformer models with attention is associated with a standard memory size and a number of transformer blocks;

obtain a request to train a custom neural transformer model with attention from a select one of the plurality of neural transformer models with attention, wherein the request includes a custom memory size for the custom neural transformer model with attention; and

when none of the plurality of neural transformer models with attention meet the custom memory size:

select one of the plurality of neural transformer models with attention having a standard memory size larger than the custom memory size;

compute a scaling factor for the selected neural transformer model with attention to fit the custom memory size;

reduce a number of transformer blocks of the selected neural transformer model with attention based on the scaling factor to generate the custom neural transformer model with attention; and

train the custom neural transformer model with attention with a training dataset tailored for a software engineering task.
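
The selection and scaling steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not define how the scaling factor is computed or how it maps to a block count, so the sketch makes these assumptions: a model "meets" the custom memory size only when its standard memory size equals it, the smallest larger model is the one selected, the scaling factor is the ratio of custom to standard memory size, and the block count shrinks in proportion to that ratio. The names `PretrainedModel` and `plan_custom_model` and the megabyte unit are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainedModel:
    """Hypothetical record for one pre-trained transformer model."""
    name: str
    memory_mb: int    # standard memory size (unit is an assumption)
    num_blocks: int   # number of transformer blocks


def plan_custom_model(models, custom_memory_mb):
    """Return (base model, scaling factor, block count) for a custom size.

    Assumptions, not taken from the claim: exact equality counts as
    "meeting" the custom size, and memory scales linearly with blocks.
    """
    # A model that already meets the custom memory size is used unchanged.
    for model in models:
        if model.memory_mb == custom_memory_mb:
            return model, 1.0, model.num_blocks

    # Otherwise select a model whose standard memory size is larger.
    larger = [m for m in models if m.memory_mb > custom_memory_mb]
    if not larger:
        raise ValueError("no pre-trained model is larger than the custom size")
    base = min(larger, key=lambda m: m.memory_mb)  # assumed tie-break rule

    # Scaling factor: fraction of the base model that fits the custom size.
    scale = custom_memory_mb / base.memory_mb

    # Reduce the block count by the scaling factor, keeping at least one.
    blocks = max(1, math.floor(base.num_blocks * scale))
    return base, scale, blocks


if __name__ == "__main__":
    catalog = [
        PretrainedModel("small", memory_mb=400, num_blocks=12),
        PretrainedModel("large", memory_mb=1600, num_blocks=24),
    ]
    base, scale, blocks = plan_custom_model(catalog, custom_memory_mb=800)
    print(base.name, scale, blocks)
```

The reduced model would then be trained on the task-specific dataset, which the sketch leaves out. Which blocks are removed, and whether their pre-trained weights are retained, is likewise outside the sketch.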