| CPC G06N 3/088 (2013.01) [G06F 8/30 (2013.01); G06F 40/40 (2020.01); G06N 3/045 (2023.01); G06F 8/427 (2013.01); G06F 8/71 (2013.01); G06N 3/04 (2013.01); G06N 3/063 (2013.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01)] | 20 Claims |

|
1. A system, comprising:
a processor; and
a memory that stores a program configured to be executed by the processor, the program including instructions to perform actions that:
provide a plurality of neural transformer models with attention trained on source code and/or natural language, wherein each of the plurality of neural transformer models with attention is associated with a standard memory size and a number of transformer blocks;
obtain a request to train a custom neural transformer model with attention from a select one of the plurality of neural transformer models with attention, wherein the request includes a custom memory size for the custom neural transformer model with attention; and
when none of the plurality of neural transformer models with attention meet the custom memory size:
select one of the plurality of neural transformer models with attention having a standard memory size larger than the custom memory size;
compute a scaling factor for the selected neural transformer model with attention to fit the custom memory size;
reduce a number of transformer blocks of the selected neural transformer model with attention based on the scaling factor to generate the custom neural transformer model with attention; and
train the custom neural transformer model with attention with a training dataset tailored for a software engineering task.
|
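The selection and scaling steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not state how the scaling factor is computed or which larger model is chosen, so the following are assumptions made for illustration only: the scaling factor is taken as the ratio of the custom memory size to the standard memory size, the smallest model that exceeds the custom size is selected, and the block count is rounded down with a floor of one block. The names `BaseModel` and `plan_custom_model` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from math import floor
from typing import Sequence, Tuple


@dataclass(frozen=True)
class BaseModel:
    """Hypothetical record for one pre-trained model offered by the system."""
    name: str
    memory_size_mb: float  # the "standard memory size" of the claim
    num_blocks: int        # number of transformer blocks


def plan_custom_model(
    models: Sequence[BaseModel], custom_memory_mb: float
) -> Tuple[BaseModel, int]:
    """Return (base model, number of transformer blocks to keep).

    Assumption: scaling factor = custom size / standard size. The claim
    only says a scaling factor is computed "to fit the custom memory size".
    """
    # If a model already meets the custom size, no scaling is needed.
    for model in models:
        if model.memory_size_mb == custom_memory_mb:
            return model, model.num_blocks

    # Otherwise consider only models whose standard size exceeds the request.
    larger = [m for m in models if m.memory_size_mb > custom_memory_mb]
    if not larger:
        raise ValueError("no model has a standard memory size above the custom size")

    # Assumption: pick the smallest such model to minimize the reduction.
    base = min(larger, key=lambda m: m.memory_size_mb)

    scaling_factor = custom_memory_mb / base.memory_size_mb
    # Reduce the block count by the scaling factor, keeping at least one block.
    kept_blocks = max(1, floor(base.num_blocks * scaling_factor))
    return base, kept_blocks


if __name__ == "__main__":
    catalog = [
        BaseModel("small", 400, 6),
        BaseModel("medium", 1200, 12),
        BaseModel("large", 3000, 24),
    ]
    base, blocks = plan_custom_model(catalog, 900)
    print(base.name, blocks)
```

The sketch covers only the planning step. Removing the blocks from the selected model and training the result on a task-specific dataset, as the final limitation of the claim recites, are outside its scope.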