US 11,860,920 B2
System and method for providing technology assisted data review with optimizing features
Duane George, Orangevale, CA (US); and Douglas Wayne Stewart, San Francisco, CA (US)
Assigned to OPEN TEXT HOLDINGS, INC., Menlo Park, CA (US)
Filed by OPEN TEXT HOLDINGS, INC., Menlo Park, CA (US)
Filed on Dec. 19, 2022, as Appl. No. 18/084,289.
Application 18/084,289 is a continuation of application No. 17/313,445, filed on May 6, 2021, granted, now 11,562,012.
Application 17/313,445 is a continuation of application No. 16/213,665, filed on Dec. 7, 2018, granted, now 11,030,230, issued on Jun. 8, 2021.
Application 16/213,665 is a continuation of application No. 15/849,375, filed on Dec. 20, 2017, granted, now 10,191,977, issued on Jan. 29, 2019.
Application 15/849,375 is a continuation of application No. 14/190,980, filed on Feb. 26, 2014, granted, now 9,886,500, issued on Feb. 6, 2018.
Claims priority of provisional application 61/780,601, filed on Mar. 13, 2013.
Prior Publication US 2023/0121279 A1, Apr. 20, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/34 (2019.01); G06F 16/93 (2019.01)
CPC G06F 16/345 (2019.01) [G06F 16/93 (2019.01)]
21 Claims
1. A method, comprising:
controlling automated assisted review of a plurality of documents within a data store of a document system by:
initiating generation of a document map for the plurality of documents using a topic-related generative model for the plurality of documents;
initiating transmission of a control set of documents from the plurality of documents to a user, the control set of documents determined based on selected documents, based on selecting the selected documents from a first stratum of the plurality of documents and a second stratum of the plurality of documents; and
based on a control set metric regarding the control set of documents provided by the user, the control set metric including an indicator of responsiveness for each of the documents of the control set of documents, initiating:
determining a responsiveness score for each of the plurality of documents based on a scoring algorithm;
determining a set of responsive documents and a set of non-responsive documents of the plurality of documents based on the responsiveness score determined for each of the plurality of documents and a decision boundary score;
determining a confidence score for the document system using the responsiveness score for each of the documents of the control set and the indicator of responsiveness for each of the control set documents provided by the user;
selecting one or more of the plurality of documents based on the responsiveness scores of the plurality of documents, wherein the responsiveness score of each of the one or more selected documents is at or near the decision boundary score;
providing a presentation of the one or more selected documents to the user;
based on an indicator of responsiveness by the user for each of the selected documents, refining the scoring algorithm based on the indicator of responsiveness for each of the selected documents; and
generating a desired confidence score for the document system, and providing a presentation of the set of responsive documents to the user when the desired confidence score for the document system is achieved, wherein the confidence score for the document system is an F1 score determined based on a comparison of the responsiveness score for the documents of the control set with the indicator of responsiveness for the documents of the control set received from the user.
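
The scoring-related steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that responsiveness scores are floats in [0, 1], that a document whose score is at or above the decision boundary counts as responsive, and that "at or near the decision boundary" means smallest absolute distance to it. The claim fixes none of these details. All function names, document ids, scores, and labels are hypothetical. Document map generation, stratified control set selection, and refinement of the scoring algorithm are not shown.

```python
from typing import Dict, Iterable, List, Tuple


def split_by_boundary(scores: Dict[str, float],
                      boundary: float) -> Tuple[List[str], List[str]]:
    """Partition document ids into (responsive, non_responsive).

    Assumption: score >= boundary means responsive.
    """
    responsive = [d for d, s in scores.items() if s >= boundary]
    non_responsive = [d for d, s in scores.items() if s < boundary]
    return responsive, non_responsive


def f1_confidence(scores: Dict[str, float],
                  control_labels: Dict[str, bool],
                  boundary: float) -> float:
    """F1 score comparing score-based predictions with the reviewer's
    responsiveness indicators on the control set."""
    tp = fp = fn = 0
    for doc_id, is_responsive in control_labels.items():
        predicted = scores[doc_id] >= boundary
        if predicted and is_responsive:
            tp += 1
        elif predicted and not is_responsive:
            fp += 1
        elif not predicted and is_responsive:
            fn += 1
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives.
    return 2 * tp / denom if denom else 0.0


def nearest_to_boundary(scores: Dict[str, float],
                        boundary: float,
                        k: int,
                        exclude: Iterable[str] = ()) -> List[str]:
    """Pick the k documents whose scores lie closest to the boundary,
    skipping any ids in `exclude` (for example, already-reviewed documents)."""
    skip = set(exclude)
    candidates = [d for d in scores if d not in skip]
    return sorted(candidates, key=lambda d: abs(scores[d] - boundary))[:k]


# Hypothetical data: six scored documents, four of them in the control set.
scores = {"d1": 0.92, "d2": 0.55, "d3": 0.48, "d4": 0.10, "d5": 0.71, "d6": 0.30}
control_labels = {"d1": True, "d4": False, "d5": False, "d6": True}
boundary = 0.5

responsive, non_responsive = split_by_boundary(scores, boundary)
confidence = f1_confidence(scores, control_labels, boundary)
to_review = nearest_to_boundary(scores, boundary, k=2, exclude=control_labels)
```

With this data the control set yields one true positive (d1), one false positive (d5), and one false negative (d6), so the F1 value is 0.5. The two unreviewed documents closest to the boundary, d3 and d2, would be presented to the user next. In a loop following the claim, the user's indicators for those documents would feed a refinement of the scoring algorithm, and the cycle would repeat until the desired confidence score is reached.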