CPC G06F 8/73 (2013.01) [G06F 21/10 (2013.01); G06N 3/08 (2013.01); G06V 30/274 (2022.01)] | 20 Claims
1. A method for processing a source code file, the method comprising:
scanning the source code file to identify text lines;
analyzing, via one or more processors, the text lines with a classifier to identify one or more of the text lines that correspond to license information, and wherein the classifier is trained with sample source code files;
generating a subset of the text lines, wherein the subset excludes the one or more of the text lines identified as corresponding to the license information;
determining whether text lines within the subset are open source code by comparing the subset to a database, wherein the database includes a plurality of text lines associated with open source code; and
outputting first text lines of the text lines that are the open source code with at least one or more of a security risk and a compliance risk for the source code file.
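For illustration only, the steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claim recites a classifier trained with sample source code files, whereas the hypothetical `is_license_line` below substitutes a simple keyword heuristic for it. The names `process_source`, `normalize`, and `LICENSE_PATTERN`, and the dictionary layout of the open source database (normalized line mapped to risk notes), are likewise assumptions introduced for this example.

```python
import re

# ASSUMPTION: keyword heuristic standing in for the trained classifier
# recited in the claim. A real system would use a model trained on
# sample source code files.
LICENSE_PATTERN = re.compile(
    r"copyright|licen[sc]e|spdx|permission is hereby granted|warranty",
    re.IGNORECASE,
)


def is_license_line(line):
    """Return True if the line looks like license information."""
    return bool(LICENSE_PATTERN.search(line))


def normalize(line):
    """Collapse whitespace so that indentation does not affect matching."""
    return " ".join(line.split())


def process_source(text, open_source_db):
    """Sketch of the claimed steps.

    open_source_db is a hypothetical mapping from a normalized text line
    to a dict holding 'security' and 'compliance' risk notes.
    """
    # Scan the file to identify text lines (blank lines are skipped).
    lines = [normalize(raw) for raw in text.splitlines() if raw.strip()]

    # Generate the subset that excludes lines identified as license information.
    subset = [line for line in lines if not is_license_line(line)]

    # Compare the subset to the database and collect matches with their risks.
    findings = []
    for line in subset:
        risks = open_source_db.get(line)
        if risks is not None:
            findings.append((line, risks))
    return findings
```

Calling `process_source(file_text, db)` returns a list of `(line, risks)` pairs for the lines found in the database. Exact-match lookup on normalized lines is the simplest possible comparison; the claim does not specify how the comparison to the database is performed, so fingerprinting or fuzzy matching would fit the same outline.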