US 12,314,154 B2
Code execution trace generation with pre-trained large language model
Nan Duan, Beijing (CN); Shengyu Fu, Redmond, WA (US); Shuai Lu, Beijing (CN); Neelakantan Sundaresan, Bellevue, WA (US); and Alexey Svyatkovskiy, Bellevue, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by MICROSOFT TECHNOLOGY LICENSING, LLC, Redmond, WA (US)
Filed on Apr. 24, 2023, as Appl. No. 18/138,330.
Prior Publication US 2024/0354222 A1, Oct. 24, 2024
Int. Cl. G06F 11/362 (2025.01); G06N 3/08 (2023.01); G06N 20/00 (2019.01)
CPC G06F 11/3636 (2013.01) [G06N 3/08 (2013.01); G06N 20/00 (2019.01)] 18 Claims
OG exemplary drawing
 
1. A system comprising:
one or more processors; and
a memory that stores one or more programs that are configured to be executed by the one or more processors, the one or more programs including instructions to perform actions that:
obtain a large language model trained on source code;
generate a first pre-training dataset including a first plurality of source code samples, each source code sample of the first plurality of source code samples paired with a corresponding code execution trace, wherein each of the source code samples of the first pre-training dataset has a single-line code execution, and wherein the code execution trace represents the dynamic behavior of the source code sample during execution of the source code sample;
train the large language model to learn to predict a code execution trace with the first pre-training dataset;
generate a second pre-training dataset including a second plurality of source code samples, each source code sample of the second plurality of source code samples paired with a corresponding code execution trace, wherein each of the source code samples of the second pre-training dataset has a multiple-line code execution; and
train the large language model to learn to predict a code execution trace with the second pre-training dataset, wherein the large language model comprises a unified cross-modal neural transformer model with attention.
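The claim pairs each source code sample with a code execution trace that captures the program's dynamic behavior during execution. The following is a minimal, illustrative sketch (not taken from the patent) of how such a code/trace pair could be produced for Python samples using the standard sys.settrace hook; the function name build_trace and the trace record layout (line number plus observed variable state) are assumptions made for illustration only.

```python
import sys

def build_trace(source: str) -> list[dict]:
    """Execute `source` and record its line-by-line execution trace.

    Illustrative only: the record layout (line number plus the local
    variable state observed before that line runs) is an assumption,
    not the trace format defined in the patent.
    """
    trace = []

    def tracer(frame, event, arg):
        # Record only events raised by the sample itself, not by library code.
        if frame.f_code.co_filename == "<sample>" and event == "line":
            state = {k: v for k, v in frame.f_locals.items()
                     if not k.startswith("__")}
            trace.append({"line": frame.f_lineno, "locals": state})
        return tracer

    code = compile(source, "<sample>", "exec")
    sys.settrace(tracer)          # install the global trace hook
    try:
        exec(code, {})            # run the sample in a fresh namespace
    finally:
        sys.settrace(None)        # always remove the hook
    return trace

# A sample with single-line code execution (first pre-training dataset).
print(build_trace("x = 2 + 3"))
# -> [{'line': 1, 'locals': {}}]

# A sample with multiple-line code execution (second pre-training dataset).
print(build_trace("a = 1\nb = a + 4\nc = a * b"))
# -> [{'line': 1, 'locals': {}}, {'line': 2, 'locals': {'a': 1}},
#     {'line': 3, 'locals': {'a': 1, 'b': 5}}]
```

The single-line sample yields a one-step trace, corresponding to the first pre-training dataset, while the multi-line sample yields a multi-step trace, corresponding to the second.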
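For the two training steps, a minimal sketch of the curriculum recited in the claim is shown below: the model is first trained on single-line code/trace pairs and then on multi-line pairs. The base checkpoint ("gpt2" as a stand-in for a large language model trained on source code), the <code>/<trace> serialization, and all hyperparameters are illustrative assumptions, not the patented configuration.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # stand-in checkpoint
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def to_dataset(pairs):
    # Serialize each (source code, execution trace) pair into one training
    # sequence; the <code>/<trace> markup is an illustrative assumption.
    texts = [f"<code>{code}</code><trace>{trace}</trace>" for code, trace in pairs]
    enc = tokenizer(texts, truncation=True, padding="max_length", max_length=128)
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]
    return Dataset.from_dict(dict(enc))

# Stage 1: samples with single-line code execution.
single_line_pairs = [("x = 2 + 3", "[{'line': 1, 'locals': {}}]")]
# Stage 2: samples with multiple-line code execution.
multi_line_pairs = [("a = 1\nb = a + 4",
                     "[{'line': 1, 'locals': {}}, {'line': 2, 'locals': {'a': 1}}]")]

for stage, pairs in (("single-line", single_line_pairs),
                     ("multi-line", multi_line_pairs)):
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"trace-pretrain-{stage}",
                               num_train_epochs=1,
                               per_device_train_batch_size=1,
                               report_to="none"),
        train_dataset=to_dataset(pairs),
    )
    trainer.train()   # the same model continues training in the second stage
```

Because the same model object is passed to both stages, the second training run continues from the weights learned on the single-line dataset, mirroring the staged pre-training described in the claim.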