US 11,720,804 B2
Data-driven automatic code review
Anshul Gupta, Kirkland, WA (US); and Neelakantan Sundaresan, Bellevue, WA (US)
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC., Redmond, WA (US)
Filed by MICROSOFT TECHNOLOGY LICENSING, LLC., Redmond, WA (US)
Filed on Jul. 12, 2018, as Appl. No. 16/34,344.
Claims priority of provisional application 62/693,010, filed on Jul. 2, 2018.
Claims priority of provisional application 62/619,805, filed on Jan. 21, 2018.
Prior Publication US 2019/0228319 A1, Jul. 25, 2019
Int. Cl. G06N 5/025 (2023.01); G06F 8/35 (2018.01); G06F 8/73 (2018.01); G06F 11/36 (2006.01); G06N 5/04 (2023.01); G06N 3/045 (2023.01); G06N 7/01 (2023.01); G06F 8/30 (2018.01); G06F 8/41 (2018.01); G06F 8/71 (2018.01); G06F 8/75 (2018.01); G06N 3/08 (2023.01); G06N 3/084 (2023.01); G06N 3/044 (2023.01)
CPC G06N 5/025 (2013.01) [G06F 8/30 (2013.01); G06F 8/35 (2013.01); G06F 8/42 (2013.01); G06F 8/71 (2013.01); G06F 8/73 (2013.01); G06F 8/75 (2013.01); G06F 11/3604 (2013.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01); G06N 5/04 (2013.01); G06N 7/01 (2023.01); G06F 11/3616 (2013.01); G06N 3/044 (2023.01); G06N 3/084 (2013.01)] 20 Claims
 
1. A system comprising:
one or more processors and a memory;
one or more modules, wherein the one or more modules are configured to be executed by the one or more processors to perform actions that:
obtain a source code program having a plurality of source code snippets;
extract features representing an input source code snippet, the features including a context of the input source code snippet, wherein the context includes source code surrounding the input source code snippet in the source code program;
search a knowledge base having a plurality of source code snippets and corresponding code reviews, each source code snippet paired with a corresponding code review;
select from the knowledge base one or more code reviews having a close similarity to the input source code snippet and to the extracted features of the input source code snippet; and
input a representation of the extracted features and the one or more code reviews into a deep learning model to generate a probability for each of the one or more code reviews, the probability indicating a likelihood that a particular one of the one or more code reviews is relevant to the input source code snippet; and
select at least one code review from the one or more code reviews for the input source code snippet based on the corresponding probability that the at least one code review is relevant to the input source code snippet.
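
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. All names here (ReviewedSnippet, extract_features, retrieve, relevance_probability, recommend_reviews) are hypothetical. Token-set Jaccard similarity is assumed for the knowledge-base search, and a hand-weighted logistic function stands in for the deep learning model, which the claim does not specify in detail.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class ReviewedSnippet:
    """One knowledge-base entry: a code snippet paired with its review."""
    code: str
    review: str


def tokenize(text):
    """Return the set of identifier-like tokens in a piece of source code."""
    return set(re.findall(r"[A-Za-z_]\w*", text))


def extract_features(program_lines, index, window=1):
    """Features of the snippet at `index`, including its surrounding context."""
    lo = max(0, index - window)
    hi = min(len(program_lines), index + window + 1)
    context = program_lines[lo:index] + program_lines[index + 1:hi]
    return {
        "snippet": tokenize(program_lines[index]),
        "context": tokenize(" ".join(context)),
    }


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(features, knowledge_base, top_k=2):
    """Select the entries most similar to the snippet and its features."""
    query = features["snippet"] | features["context"]
    ranked = sorted(
        knowledge_base,
        key=lambda entry: jaccard(query, tokenize(entry.code)),
        reverse=True,
    )
    return ranked[:top_k]


def relevance_probability(features, entry):
    """ASSUMPTION: placeholder scorer standing in for the deep learning model.

    A logistic function over two similarity scores with hand-picked weights;
    a real system would use a trained network here.
    """
    entry_tokens = tokenize(entry.code)
    snippet_sim = jaccard(features["snippet"], entry_tokens)
    context_sim = jaccard(features["context"], entry_tokens)
    z = 6.0 * snippet_sim + 2.0 * context_sim - 2.0
    return 1.0 / (1.0 + math.exp(-z))


def recommend_reviews(program_lines, index, knowledge_base, top_k=2, threshold=0.5):
    """Run the pipeline: features -> search -> score -> select by probability."""
    features = extract_features(program_lines, index)
    candidates = retrieve(features, knowledge_base, top_k)
    scored = [(relevance_probability(features, e), e.review) for e in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(p, review) for p, review in scored if p >= threshold]
```

Calling recommend_reviews with a list of program lines, the index of the snippet of interest, and a list of ReviewedSnippet entries returns (probability, review) pairs at or above the threshold, highest probability first. The threshold and weights are illustrative values only.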