US 12,259,930 B2
System and method for automated file reporting
Leo Zovic, North York (CA); Connor Atchison, Elora (CA); Luke Boudreau, Rockville (CA); Wei Sun, Kitchener (CA); Ryan Jugdeo, Toronto (CA); and Erik Derohanian, Toronto (CA)
Assigned to WISEDOCS INC., Toronto (CA)
Appl. No. 17/616,451
Filed by WISEDOCS INC., Toronto (CA)
PCT Filed Jun. 5, 2020, PCT No. PCT/CA2020/050782
§ 371(c)(1), (2) Date Dec. 3, 2021,
PCT Pub. No. WO2020/243846, PCT Pub. Date Dec. 10, 2020.
Claims priority of provisional application 62/857,930, filed on Jun. 6, 2019.
Prior Publication US 2022/0237230 A1, Jul. 28, 2022
Int. Cl. G06F 16/906 (2019.01); G06F 16/901 (2019.01); G06F 16/93 (2019.01)
CPC G06F 16/906 (2019.01) [G06F 16/901 (2019.01); G06F 16/93 (2019.01)]
24 Claims
 
1. A document summary generating system comprising:
at least one processor; and
a memory storing a sequence of instructions which when executed by the at least one processor configures the at least one processor to:
obtain a document;
divide the document into chunks of content;
encode each chunk of the chunks of content to obtain encoded chunks of content;
cluster the encoded chunks of content into clusters of encoded chunks;
determine at least one central encoded chunk in each cluster of the clusters of encoded chunks;
generate a summary for the document based on the at least one central encoded chunk for each cluster of the clusters of encoded chunks;
determine a similarity score between a ground truth graph associated with the document and a predicted graph associated with the document, the at least one processor configured to:
obtain ground truth data with manually applied labels;
generate a graph for the ground truth data with manually applied labels;
generate a predicted graph using predicted attributes associated with the document, the at least one processor configured to:
obtain classified pages and unclassified pages from the document using a known document classifier;
extract known attributes from the classified pages using a document type classifier; and
extract the predicted attributes from the unclassified pages using a page classifier; and
determine a graph edit distance between the generated graph for the ground truth data and the predicted graph.
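
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it is hypothetical. It assumes sentence-level chunks, a bag-of-words count vector in place of a learned encoder, plain k-means for the clustering step, and the chunk nearest each centroid as the "central" chunk. For the graph comparison it assumes uniquely labelled nodes, so that a unit-cost edit distance reduces to counting the nodes and edges present in only one of the two graphs. The page classifiers that would produce the predicted graph are outside the sketch.

```python
import math
import re
from collections import Counter


def split_into_chunks(document):
    """Divide the document into chunks (assumption: one chunk per sentence)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def encode(chunk, vocabulary):
    """Encode a chunk as a word-count vector (stand-in for a learned encoder)."""
    counts = Counter(re.findall(r"[a-z]+", chunk.lower()))
    return [float(counts[word]) for word in vocabulary]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=20):
    """Plain k-means, initialised deterministically from the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: distance(v, centroids[c])) for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def central_chunks(document, k=2):
    """Return, in document order, the chunk nearest each cluster centroid."""
    chunks = split_into_chunks(document)
    vocabulary = sorted({w for c in chunks for w in re.findall(r"[a-z]+", c.lower())})
    vectors = [encode(c, vocabulary) for c in chunks]
    k = min(k, len(chunks))
    assignment, centroids = kmeans(vectors, k)
    picked = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            picked.append(min(members, key=lambda i: distance(vectors[i], centroids[c])))
    return [chunks[i] for i in sorted(picked)]


def summarize(document, k=2):
    """Build an extractive summary from the central chunks."""
    return " ".join(central_chunks(document, k))


def graph_similarity(truth_nodes, truth_edges, predicted_nodes, predicted_edges):
    """Return (edit_count, similarity) for two graphs with uniquely labelled nodes.

    Assumption: each differing node or directed edge costs one edit, and the
    similarity is 1 minus the edit count normalised by the combined graph size.
    """
    t_nodes, p_nodes = set(truth_nodes), set(predicted_nodes)
    t_edges, p_edges = set(truth_edges), set(predicted_edges)
    edits = len(t_nodes ^ p_nodes) + len(t_edges ^ p_edges)
    size = len(t_nodes) + len(p_nodes) + len(t_edges) + len(p_edges)
    return edits, (1.0 - edits / size if size else 1.0)


if __name__ == "__main__":
    text = "Cats purr. Cats purr loudly. Dogs bark. Dogs bark loudly."
    print(summarize(text, k=2))
    print(graph_similarity(
        {"report", "page1", "page2"}, {("report", "page1"), ("report", "page2")},
        {"report", "page1", "page3"}, {("report", "page1"), ("report", "page3")},
    ))
```

A general graph edit distance must also search over node correspondences, which is far more expensive. The set-difference shortcut used here holds only under the unique-label assumption stated above.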