US 11,704,099 B1
Discovering matching code segments according to index and comparative similarity
Trevor Andrew Morse, Monroe, WA (US); Rama Krishna Sandeep Pokkunuri, Redmond, WA (US); and Matthew Lee, Long Island City, NY (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 31, 2022, as Appl. No. 17/710,528.
Int. Cl. G06F 8/40 (2018.01); G06F 8/41 (2018.01); G06F 8/75 (2018.01)
CPC G06F 8/425 (2013.01) [G06F 8/427 (2013.01); G06F 8/751 (2013.01)] 20 Claims
 
1. A system, comprising: at least one processor; and a memory, storing program instructions that when executed by the at least one processor, cause the at least one processor to implement:
receiving a request for a match for a code segment specified in a programming language;
parsing the code segment to generate a code structure representation for the code segment;
applying a hash function to the code structure representation to generate an index value;
accessing a data store using the index value to obtain a logic tree representation for a stored code segment specified in the programming language, wherein the logic tree representation is determined from respective tokens identified in the stored code segment;
generating a different logic tree representation from different respective tokens identified from the code segment; and
based on a comparison of the logic tree representation for the stored code segment with the different logic tree representation for the code segment, identifying the stored code segment as a match for the code segment.
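
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`index_value`, `logic_tree`, `similarity`, `add_segment`, `find_match`, `store`) is hypothetical, and the concrete choices are assumptions made for illustration only: the code structure representation is taken to be the shape of the Python AST, the hash function is SHA-256, the data store is an in-memory dictionary, the logic tree nests tokens by brackets and indentation, and the comparison is a positional agreement score with an arbitrary 0.5 threshold.

```python
import ast
import hashlib
import io
import tokenize

OPEN, CLOSE = {"(", "[", "{"}, {")", "]", "}"}
SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER}


def index_value(source):
    """Parse the segment and hash its structure (AST node types only)."""
    def shape(node):
        children = tuple(shape(c) for c in ast.iter_child_nodes(node))
        return (type(node).__name__, children)
    structure = repr(shape(ast.parse(source)))
    return hashlib.sha256(structure.encode()).hexdigest()


def logic_tree(source):
    """Build a nested list of token strings (assumed form of a logic tree).

    Bracketed groups and indented blocks become subtrees.
    """
    root = []
    stack = [root]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        is_op = tok.type == tokenize.OP
        if tok.type == tokenize.INDENT or (is_op and tok.string in OPEN):
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok.type == tokenize.DEDENT or (is_op and tok.string in CLOSE):
            if len(stack) > 1:
                stack.pop()
        else:
            stack[-1].append(tok.string)
    return root


def _count(a, b):
    """Return (matching positions, total positions) for two trees."""
    if isinstance(a, list) and isinstance(b, list):
        matched = total = 0
        for x, y in zip(a, b):
            m, t = _count(x, y)
            matched += m
            total += t
        # Unpaired trailing items count as one mismatch each.
        return matched, total + abs(len(a) - len(b))
    return (1 if a == b else 0), 1


def similarity(a, b):
    """Fraction of positions that agree when both trees are walked in parallel."""
    matched, total = _count(a, b)
    return matched / total if total else 1.0


store = {}  # index value -> list of (stored source, stored logic tree)


def add_segment(source):
    """Store a segment and its logic tree under its structural index value."""
    store.setdefault(index_value(source), []).append((source, logic_tree(source)))


def find_match(source, threshold=0.5):
    """Look up candidates by index value, then compare logic trees."""
    query_tree = logic_tree(source)
    best = None
    for stored_source, stored_tree in store.get(index_value(source), []):
        score = similarity(stored_tree, query_tree)
        if score >= threshold and (best is None or score > best[1]):
            best = (stored_source, score)
    return best
```

In this sketch, a query such as `def add(x, y): return x + y` would hash to the same index value as a stored `def add(a, b): return a + b`, because the AST shape ignores identifier names. The logic tree comparison then scores how closely the tokens agree. Separating the cheap index lookup from the finer comparison mirrors the two stages named in the claim, though the claim itself does not specify any of these particular data structures or thresholds.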