US 12,265,805 B2
Syntactically coherent code segmentation
Navneet Potti, Sunnyvale, CA (US); and Joshua Howland, Mountain View, CA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by GOOGLE LLC, Mountain View, CA (US)
Filed on Jan. 26, 2023, as Appl. No. 18/102,039.
Prior Publication US 2024/0256235 A1, Aug. 1, 2024
Int. Cl. G06F 9/44 (2018.01); G06F 8/41 (2018.01)
CPC G06F 8/433 (2013.01) [G06F 8/425 (2013.01); G06F 8/427 (2013.01)]
18 Claims
 
1. A method implemented by one or more processors and comprising:
processing source code to generate one or more graphs representing the source code;
traversing one or more of the graphs to identify one or more sequences of tokens within the source code that satisfy an input constraint of a trained machine learning model comprising a transformer network with an attention mechanism, wherein the input constraint comprises a limit on how many tokens can be processed during a single iteration of the transformer network;
segmenting the source code into the identified one or more sequences of tokens; and
processing the one or more sequences of tokens using the transformer network.
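
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. It assumes Python source as input, uses the standard-library `ast` module as the graph representation, and approximates the model's tokenizer with a simple regular expression. The names `count_tokens`, `segment_source`, `process_segments`, and the `model` callable are hypothetical and introduced only for illustration.

```python
import ast
import re


def count_tokens(text: str) -> int:
    """Rough stand-in for a model tokenizer (assumption): words and punctuation marks."""
    return len(re.findall(r"\w+|[^\w\s]", text))


def segment_source(source: str, max_tokens: int) -> list[str]:
    """Split source into statement-aligned segments of at most max_tokens tokens each."""
    segments: list[str] = []

    def visit(stmt: ast.stmt) -> None:
        text = ast.get_source_segment(source, stmt)
        if text is not None and count_tokens(text) <= max_tokens:
            # The whole statement fits the input constraint, so keep it intact.
            segments.append(text)
            return
        # Too large: descend to the nested statements. Headers such as `def ...:`
        # and `except` handlers are dropped in this simplified sketch.
        children = [c for c in ast.iter_child_nodes(stmt) if isinstance(c, ast.stmt)]
        if not children:
            raise ValueError("a single statement exceeds the token limit")
        for child in children:
            visit(child)

    # Build the tree (the "graph") and traverse it from the top-level statements.
    for stmt in ast.parse(source).body:
        visit(stmt)
    return segments


def process_segments(segments: list[str], model) -> list:
    """Apply a hypothetical model callable to each segment in turn."""
    return [model(segment) for segment in segments]


SAMPLE = '''
def add(a, b):
    return a + b

def describe(x):
    if x > 0:
        label = "positive"
    else:
        label = "non-positive"
    return label
'''

segments = segment_source(SAMPLE, max_tokens=16)
lengths = process_segments(segments, model=count_tokens)
```

In this sketch, `add` fits within the limit and is kept whole, while `describe` is too large and is broken at statement boundaries rather than at an arbitrary token offset. A fuller implementation would presumably also merge adjacent small segments and preserve enclosing headers; those choices are left open here.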