US 11,928,427 B2
Linguistic analysis of seed documents and peer groups
Lewis C. Lee, Atherton, CA (US)
Assigned to AON RISK SERVICES, INC. OF MARYLAND, New York, NY (US)
Filed by AON RISK SERVICES, INC. OF MARYLAND, New York, NY (US)
Filed on Dec. 8, 2020, as Appl. No. 17/115,249.
Prior Publication US 2022/0180059 A1, Jun. 9, 2022
Int. Cl. G06F 40/253 (2020.01); G06F 16/93 (2019.01); G06F 40/30 (2020.01)
CPC G06F 40/253 (2020.01) [G06F 16/93 (2019.01); G06F 40/30 (2020.01); G06F 2216/11 (2013.01)]
19 Claims
 
1. A method comprising:
receiving a seed document including content;
determining that the seed document is a patent document and that the content includes a patent claim, wherein the patent document is classified according to a classification;
generating, based upon a natural language processing technique, noun phrases for the patent claim within the seed document, wherein the natural language processing technique segments the patent claim, performs syntactic parsing, generates tagging data associated with the patent claim, and utilizes a claim-specific grammar model to generate the noun phrases;
generating a trained machine learning model configured to identify elements of patent claims, the trained machine learning model trained on a document corpus specific to the seed document;
analyzing the noun phrases to identify an element of the patent claim, the element comprising a plurality of syntactically related concepts in the patent claim, the element being one of a plurality of elements of the patent claim, and the element including a plurality of words, wherein the analyzing is performed utilizing the trained machine learning model;
determining a breadth of the element with respect to one or more additional elements of the plurality of elements of the patent claim;
generating, based at least partly on the breadth of the element, a search string that includes one or more of the words of the plurality of words of the element;
sending the search string to a third-party searching authority;
obtaining a plurality of additional patent documents, the plurality of additional patent documents being classified according to the classification and the plurality of additional patent documents including a first set of additional patent documents having a first priority date before a priority date of the patent document and a second set of additional patent documents having a second priority date after the priority date of the patent document;
determining a first frequency of occurrence of a word of the element of the claim in the first set of additional patent documents;
determining a second frequency of occurrence of the word of the element of the claim in the second set of additional patent documents;
analyzing the first frequency of occurrence of the word with respect to a first threshold frequency of occurrence; and
analyzing the second frequency of occurrence of the word with respect to a second threshold frequency of occurrence, the second threshold frequency of occurrence being greater than the first threshold frequency of occurrence.
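
The last four steps of claim 1 (the two frequency determinations and the two threshold comparisons) can be illustrated with a minimal sketch. This is not the patented implementation. The `PatentDoc` record, the function names, the definition of "frequency of occurrence" as the fraction of documents containing the word, and the direction of each threshold comparison are all assumptions made for illustration. The claim itself requires only that the second threshold be greater than the first.

```python
from dataclasses import dataclass
from datetime import date
import re


@dataclass
class PatentDoc:
    # Hypothetical minimal record; a real patent document carries many more fields.
    text: str
    priority_date: date


def word_frequency(word: str, docs: list) -> float:
    """Fraction of documents containing `word` as a whole word.

    Assumed definition: the claim does not say how frequency is measured.
    """
    if not docs:
        return 0.0
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits = sum(1 for doc in docs if pattern.search(doc.text))
    return hits / len(docs)


def analyze_word(word, seed_priority, peers, first_threshold, second_threshold):
    """Compare a claim word's frequency in peers dated before and after the seed."""
    # The claim requires the second threshold to exceed the first.
    if second_threshold <= first_threshold:
        raise ValueError("second_threshold must be greater than first_threshold")

    # Split the peer group around the seed document's priority date.
    first_set = [d for d in peers if d.priority_date < seed_priority]
    second_set = [d for d in peers if d.priority_date > seed_priority]

    first_freq = word_frequency(word, first_set)
    second_freq = word_frequency(word, second_set)

    return {
        "first_frequency": first_freq,
        "second_frequency": second_freq,
        # Assumed interpretation: a word that is rare before the seed and
        # common after it may indicate terminology the seed introduced.
        "below_first_threshold": first_freq < first_threshold,
        "above_second_threshold": second_freq > second_threshold,
    }
```

Under these assumptions, a caller would pass the peer documents obtained from the searching authority, already restricted to the seed document's classification, and inspect the two boolean flags together.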