| CPC G06F 40/205 (2020.01) [G06F 40/20 (2020.01); G06F 40/279 (2020.01); G06N 20/00 (2019.01)] | 18 Claims |

|
1. A method for assisted review of a document, the method comprising:
identifying two or more similar reference text segments, from a reference corpus of text content, that are similar to a text segment of the document by:
converting the text segment to a dense vector representation of the text segment after replacing numerical text in the text segment with a corresponding token representing the numerical text;
converting reference text segments from the reference corpus to corresponding dense vector representations of the reference text segments after replacing corresponding numerical text in the reference text segments with corresponding tokens representing the corresponding numerical text;
computing corresponding similarity scores between the dense vector representation of the text segment and the corresponding dense vector representations of the reference text segments using a machine learning model trained using the reference corpus to identify similar text segments; and
subsequent to determining the two or more similar reference text segments based on each corresponding similarity score, between the dense vector representation and the corresponding dense vector representations, being above a threshold level of similarity:
accessing the corresponding numerical text from each of the two or more similar reference text segments; and
determining a computed numerical value from the two or more similar reference text segments based on computing at least one of an average value, a median value, a minimum value, or a maximum value of the corresponding numerical text from each of the two or more similar reference text segments; and
providing information for the text segment for display on a user interface, the information including the computed numerical value from the two or more similar reference text segments.
|
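
For illustration only, the steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claim recites a machine learning model trained on the reference corpus, whereas the sketch substitutes a plain word-count vector over a shared vocabulary with cosine similarity. The names `mask_numbers`, `to_vector`, `cosine`, and `suggest_value`, the `<NUM>` token, and the 0.8 threshold are all hypothetical, and only the first number in each reference segment is used.

```python
import math
import re
import statistics
from collections import Counter

NUM_TOKEN = "<NUM>"                      # hypothetical placeholder token
NUM_RE = re.compile(r"\d+(?:\.\d+)?")    # integers and simple decimals only


def mask_numbers(text):
    """Replace numerical text with a token; also return the numbers found."""
    values = [float(m) for m in NUM_RE.findall(text)]
    return NUM_RE.sub(NUM_TOKEN, text), values


def to_vector(tokens, vocab):
    """Word-count vector over a fixed vocabulary.

    Assumption: this stands in for the trained model's dense embedding.
    """
    counts = Counter(tokens)
    return [float(counts[t]) for t in vocab]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_value(segment, references, threshold=0.8, how="median"):
    """Return an aggregate of the numbers in similar reference segments.

    Returns None unless two or more reference segments score above the
    threshold, mirroring the "two or more" condition in the claim.
    """
    masked_query, _ = mask_numbers(segment)
    masked_refs = [mask_numbers(ref) for ref in references]

    # Build a shared vocabulary so every vector has the same dimensions.
    query_tokens = masked_query.lower().split()
    ref_tokens = [masked.lower().split() for masked, _ in masked_refs]
    vocab = sorted(set(query_tokens).union(*ref_tokens))

    query_vec = to_vector(query_tokens, vocab)
    matched = []
    for tokens, (_, values) in zip(ref_tokens, masked_refs):
        score = cosine(query_vec, to_vector(tokens, vocab))
        if values and score > threshold:
            matched.append(values[0])    # simplification: first number only

    if len(matched) < 2:
        return None

    aggregate = {
        "average": statistics.mean,
        "median": statistics.median,
        "minimum": min,
        "maximum": max,
    }
    return aggregate[how](matched)
```

Because numbers are masked before vectorization, segments that differ only in their numerical text map to identical vectors, so the similarity score reflects the wording rather than the particular values. The aggregated value is what would be surfaced on the user interface alongside the reviewed segment.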