US 12,242,710 B2
Natural language processing system and method for documents
Nicholas E. Vandivere, Spring, TX (US); and Michael B. Kuykendall, Spring, TX (US)
Assigned to Thomson Reuters Enterprise Centre GmbH, Zug (CH)
Filed by Thomson Reuters Enterprise Centre GmbH, Zug (CH)
Filed on Dec. 15, 2023, as Appl. No. 18/541,901.
Application 18/541,901 is a continuation of application No. 17/545,662, filed on Dec. 8, 2021, granted, now 11,861,143.
Application 17/545,662 is a continuation of application No. 15/887,689, filed on Feb. 2, 2018, granted, now 11,226,720, issued on Jan. 18, 2022.
Claims priority of provisional application 62/584,527, filed on Nov. 10, 2017.
Claims priority of provisional application 62/573,542, filed on Oct. 17, 2017.
Claims priority of provisional application 62/454,648, filed on Feb. 3, 2017.
Prior Publication US 2024/0111396 A1, Apr. 4, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 3/0482 (2013.01); G06F 3/0484 (2022.01); G06F 16/93 (2019.01); G06F 40/106 (2020.01); G06F 40/30 (2020.01); G06N 20/00 (2019.01)
CPC G06F 3/0482 (2013.01) [G06F 3/0484 (2013.01); G06F 16/93 (2019.01); G06F 40/106 (2020.01); G06F 40/30 (2020.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A method for categorizing electronic documents, the method comprising:
receiving, by a processor, a plurality of electronic documents;
associating, by a plurality of trained machine learning models comprising a paragraph model trained to identify one or more categories associated with paragraphs of text and a sentence model trained to identify one or more subcategories of the one or more categories associated with sentences of text, a category and a subcategory for each of the plurality of electronic documents, the one or more categories corresponding to conceptual context of a content of the text;
identifying, by the processor, a conflict between a category and a subcategory associated with a first document of the plurality of electronic documents and a category and a subcategory associated with a second document of the plurality of documents;
removing, based on the identified conflict, an association of the category and the subcategory from the first document of the plurality of electronic documents; and
generating a graphical user interface comprising a navigable document image of the first document and the second document and a list of the associated category and the subcategory within the image of the first document and the second document.
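
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The keyword rules stand in for the trained paragraph and sentence models, and the conflict rule (the same category and subcategory attached to differing text in two documents) is an assumption, since the claim does not define how a conflict is detected. All names (`Document`, `paragraph_model`, `sentence_model`, `associate`, `find_conflicts`, `remove_associations`, `render_listing`) are hypothetical, and a plain-text listing stands in for the graphical user interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Tag = Tuple[str, str, str]  # (category, subcategory, tagged sentence)


def paragraph_model(paragraph: str) -> Optional[str]:
    """Hypothetical stand-in for a trained paragraph-level classifier."""
    text = paragraph.lower()
    if "terminate" in text or "termination" in text:
        return "Termination"
    if "pay" in text:
        return "Payment"
    return None


def sentence_model(sentence: str, category: str) -> Optional[str]:
    """Hypothetical stand-in for a trained sentence-level classifier."""
    text = sentence.lower()
    if category == "Termination" and "days" in text:
        return "Notice Period"
    if category == "Payment" and "days" in text:
        return "Payment Terms"
    return None


@dataclass
class Document:
    name: str
    text: str
    tags: List[Tag] = field(default_factory=list)


def associate(doc: Document) -> None:
    """Attach a category per paragraph and a subcategory per sentence."""
    for paragraph in doc.text.split("\n\n"):
        category = paragraph_model(paragraph)
        if category is None:
            continue
        for sentence in paragraph.split(". "):
            subcategory = sentence_model(sentence, category)
            if subcategory is not None:
                doc.tags.append((category, subcategory, sentence.strip()))


def find_conflicts(first: Document, second: Document) -> Set[Tuple[str, str]]:
    """Assumed rule: same (category, subcategory) tagged on differing text."""
    conflicts = set()
    for cat1, sub1, sent1 in first.tags:
        for cat2, sub2, sent2 in second.tags:
            if (cat1, sub1) == (cat2, sub2) and sent1 != sent2:
                conflicts.add((cat1, sub1))
    return conflicts


def remove_associations(doc: Document, conflicts: Set[Tuple[str, str]]) -> None:
    """Drop the conflicting category/subcategory associations from a document."""
    doc.tags = [t for t in doc.tags if (t[0], t[1]) not in conflicts]


def render_listing(docs: List[Document]) -> str:
    """Plain-text stand-in for the GUI's list of categories and subcategories."""
    lines = []
    for doc in docs:
        lines.append(doc.name)
        for category, subcategory, sentence in doc.tags:
            lines.append(f"  {category} > {subcategory}: {sentence}")
    return "\n".join(lines)


# Example run on two invented documents.
first = Document(
    "master.txt",
    "Either party may terminate this agreement. "
    "Notice must be given 30 days in advance.",
)
second = Document(
    "amendment.txt",
    "Either party may terminate this agreement. "
    "Notice must be given 60 days in advance.",
)
for d in (first, second):
    associate(d)
remove_associations(first, find_conflicts(first, second))
print(render_listing([first, second]))
```

In this sketch the two documents receive the same category and subcategory for differing notice sentences, so the association is removed from the first document only, mirroring the removing step of the claim.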