US 12,314,707 B2
Pre-training for automating code review activities
Nan Duan, Beijing (CN); Shengyu Fu, Redmond, WA (US); Shuai Lu, Beijing (CN); Neelakantan Sundaresan, Bellevue, WA (US); and Alexey Svyatkovskiy, Bellevue, WA (US)
Assigned to Microsoft Technology Licensing, LLC., Redmond, WA (US)
Filed by MICROSOFT TECHNOLOGY LICENSING, LLC., Redmond, WA (US)
Filed on Nov. 12, 2022, as Appl. No. 17/985,849.
Prior Publication US 2024/0160435 A1, May 16, 2024
Int. Cl. G06F 8/71 (2018.01); G06N 3/08 (2023.01)
CPC G06F 8/71 (2013.01) [G06N 3/08 (2013.01)]
19 Claims
 
1. A computer-implemented method, comprising:
searching a source repository for a plurality of code change snippets and a plurality of code reviews, one or more of the plurality of code reviews associated with select ones of the code change snippets;
obtaining each of the plurality of code change snippets in a code diff format, wherein the code diff format includes one or more tags, wherein a tag represents an edit made to an original code associated with a select code change snippet of the plurality of code change snippets;
transforming each of the plurality of code change snippets into a code diff hunk, wherein the code diff hunk includes changed code and surrounding context;
randomly denoising each of the code diff hunks of the plurality of code change snippets and each of the plurality of code reviews; and
generating a pre-trained deep learning model from the randomly denoised code diff hunks of the plurality of code change snippets and the randomly denoised code reviews, wherein the pre-trained deep learning model predicts a code review given an input code diff hunk.
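
The data-preparation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a single-file diff in unified format, and the tag tokens (`<add>`, `<del>`, `<keep>`), the function names, and the sentinel tokens (`<mask0>`, `<mask1>`, ...) are all hypothetical. The claim does not specify a noise function, so the sketch substitutes random span masking with sentinels, in the style of common denoising pre-training objectives: spans of the input are hidden and a model would be trained to reconstruct them.

```python
import random
import re

# Matches a unified-diff hunk header such as "@@ -1,3 +1,3 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@")

# Hypothetical tag tokens; the claim only states that a tag represents an edit.
TAGS = {"+": "<add>", "-": "<del>", " ": "<keep>"}


def split_hunks(diff_text):
    """Split a single-file unified diff into hunks of (tag, code) pairs.

    Context lines (tagged <keep>) are the "surrounding context" of a hunk.
    Lines before the first hunk header (file headers) are ignored.
    """
    hunks, current = [], None
    for line in diff_text.splitlines():
        if HUNK_HEADER.match(line):
            current = []
            hunks.append(current)
        elif current is not None and line[:1] in TAGS:
            current.append((TAGS[line[0]], line[1:]))
    return hunks


def hunk_to_tokens(hunk):
    """Flatten a hunk into a token list: each tag followed by its code tokens."""
    tokens = []
    for tag, code in hunk:
        tokens.append(tag)
        tokens.extend(code.split())  # whitespace tokenizer, an assumption
    return tokens


def mask_spans(tokens, rng, mask_rate=0.15, max_span=3):
    """Replace random token spans with sentinels (assumed noise function).

    Returns (corrupted, target). A model would be trained to produce
    `target` from `corrupted`, i.e. to denoise the input.
    """
    corrupted, target = [], []
    i = sentinel = 0
    while i < len(tokens):
        if rng.random() < mask_rate:
            span = rng.randint(1, max_span)
            mark = f"<mask{sentinel}>"
            corrupted.append(mark)
            target.append(mark)
            target.extend(tokens[i:i + span])
            i += span
            sentinel += 1
        else:
            corrupted.append(tokens[i])
            i += 1
    return corrupted, target


# Illustrative inputs, not taken from the patent.
DIFF = """\
--- a/util.py
+++ b/util.py
@@ -1,3 +1,3 @@
 def area(w, h):
-    return w + h
+    return w * h
 print(area(2, 3))
"""
REVIEW = "Area should multiply width and height, not add them."

rng = random.Random(0)  # fixed seed so the corruption is repeatable
for hunk in split_hunks(DIFF):
    noisy_code, code_target = mask_spans(hunk_to_tokens(hunk), rng)
    noisy_review, review_target = mask_spans(REVIEW.split(), rng)
    print(noisy_code, code_target)
    print(noisy_review, review_target)
```

In this sketch the code diff hunk and its associated review are corrupted independently, which yields one training pair per input. Whether the claimed method corrupts them jointly or separately is not stated in claim 1.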