US 12,443,788 B2
Automated document harvesting and regenerating by crowdsourcing in enterprise social networks
James William Murdock, IV, Amawalk, NY (US); Radha Mohan De, Howrah (IN); Sneha Srinivasan, San Jose, CA (US); Mary Diane Swift, Rochester, NY (US); and Caesar Chatterjee, Kolkata (IN)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Dec. 6, 2022, as Appl. No. 18/062,344.
Prior Publication US 2024/0184981 A1, Jun. 6, 2024
Int. Cl. G06F 40/186 (2020.01); G06F 40/30 (2020.01); G06V 30/412 (2022.01)
CPC G06F 40/186 (2020.01) [G06F 40/30 (2020.01); G06V 30/412 (2022.01)]
20 Claims
 
1. A system, comprising:
a memory configured to store computer executable components; and
a processor configured to execute at least one of the computer executable components that:
trains, using a set of training data, a machine learning model to generate content for data fields of a document template for a topic, wherein the machine learning model comprises a respective algorithm for each of the data fields, and wherein the set of training data comprises:
the document template for the topic,
training documents that comply with the document template for the topic, and
at least one of documents, emails, or correspondence associated with the topic communicated between people in an enterprise associated with the document template;
generates, using the machine learning model, a machine generated document based on the document template, wherein at least one data field of the data fields of the machine generated document contains respective content generated by the machine learning model;
monitors respective edits, made by one or more of the people, to the respective content in the at least one data field of the machine generated document;
determines, from the respective edits, respective accuracies of the algorithms associated with the at least one data field to generate the respective content; and
based on determining that a respective accuracy for at least one of the algorithms does not meet a threshold accuracy:
generates an enterprise social graph of the people in the enterprise, and
retrains, using the set of training data and the enterprise social graph, the machine learning model to generate the content for the data fields of the document template for the topic, wherein the retraining is based on respective distances in the enterprise social graph between first people mentioned in the set of training data associated with respective data fields associated with the at least one of the algorithms that does not meet the threshold accuracy and second people that edited the respective data fields associated with the at least one of the algorithms that does not meet the threshold accuracy.
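
The monitoring, accuracy, and graph-distance steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the claim does not specify how accuracy is measured or how distances influence retraining. The sketch assumes that accuracy is the text similarity between the generated content and its edited form, that the enterprise social graph is an unweighted adjacency mapping, and that a closer editor yields a larger retraining weight. All function names, the similarity measure, and the 1 / (1 + distance) weighting are hypothetical.

```python
from collections import deque
from difflib import SequenceMatcher


def field_accuracy(generated: str, edited: str) -> float:
    """Assumed accuracy measure: similarity in [0, 1] between the
    generated content and the content after human edits."""
    return SequenceMatcher(None, generated, edited).ratio()


def fields_below_threshold(generated: dict, edited: dict, threshold: float) -> list:
    """Return the data fields whose per-field accuracy is under the threshold.
    A field with no recorded edit is treated as unchanged."""
    return [
        name
        for name, text in generated.items()
        if field_accuracy(text, edited.get(name, text)) < threshold
    ]


def graph_distance(graph: dict, source: str, target: str):
    """Breadth-first hop count between two people in the social graph.
    Returns None when no path connects them."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        person, hops = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor == target:
                return hops + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


def retraining_weight(graph: dict, mentioned: list, editors: list) -> float:
    """Hypothetical weighting: 1 / (1 + shortest distance) between any
    person mentioned in the training data and any editor of the field.
    Returns 0.0 when no mentioned person is connected to any editor."""
    best = None
    for m in mentioned:
        for e in editors:
            d = graph_distance(graph, m, e)
            if d is not None and (best is None or d < best):
                best = d
    return 0.0 if best is None else 1.0 / (1 + best)


# Example data (invented for illustration only).
generated = {"summary": "Quarterly revenue rose", "owner": "Alice"}
edited = {"summary": "Quarterly revenue rose", "owner": "Bob"}
graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob"],
}

weak_fields = fields_below_threshold(generated, edited, threshold=0.8)
weight = retraining_weight(graph, mentioned=["alice"], editors=["carol"])
```

In this sketch only the fields returned by fields_below_threshold would have their algorithms retrained, and the weight indicates how strongly the associated training data would count during that retraining. How a real system combines distances with the training data is left open by the claim.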