CPC G06F 16/353 (2019.01) [G06F 16/334 (2019.01); G06F 16/93 (2019.01); G06F 18/22 (2023.01); G06V 10/762 (2022.01); G06V 30/40 (2022.01); G06V 30/418 (2022.01); G06V 30/10 (2022.01)] | 16 Claims |
1. A system, comprising:
a document repository, comprising a memory, storing a document corpus containing a plurality of incoming documents grouped in a respective cluster of a plurality of clusters based on a common theme and a plurality of document templates including response documents sharing the common theme to be sent in response to at least one of the incoming documents; and
a document analytics component including a processor and communication interface coupled to the document repository, wherein the processor of the document analytics component is operable to:
receive an incoming document;
access the document corpus in the document repository;
analyze the document corpus with reference to a common theme of the incoming document; and
based on results of the analyzing, select a response document template of a response document that shares the common theme of the incoming document;
determine a common theme of the incoming document in one respective cluster of the plurality of clusters, wherein the determining the common theme of the incoming document in one respective cluster of the plurality of clusters, is further operable to:
compare an intrinsic similarity value for each cluster in the plurality of clusters, wherein the intrinsic similarity value is based on a mean value of a cosine similarity between pairs of documents in the document corpus; and
based on the intrinsic similarity value of each respective cluster in the plurality of clusters exceeding an intrinsic similarity value threshold, remove the respective cluster from the document corpus as a candidate cluster.
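The cluster-filtering step recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the claim does not say how documents are vectorized, so the whitespace term-count vectors, the per-cluster scope of the pairwise mean, and the names `term_vector`, `intrinsic_similarity`, and `candidate_clusters` are all assumptions made for illustration. The sketch computes a mean pairwise cosine similarity for each cluster and drops any cluster whose value exceeds the threshold, as the claim describes.

```python
import math
from collections import Counter
from itertools import combinations


def term_vector(text):
    # Assumption: lowercase whitespace tokens with raw term counts.
    # The claim does not specify a vectorization scheme.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Cosine of the angle between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def intrinsic_similarity(documents):
    # Mean cosine similarity over all unordered document pairs.
    # Assumption: the mean is taken within a single cluster.
    vectors = [term_vector(d) for d in documents]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


def candidate_clusters(clusters, threshold):
    # Score every cluster, then remove those whose intrinsic
    # similarity exceeds the threshold, following the claim wording.
    scores = {name: intrinsic_similarity(docs) for name, docs in clusters.items()}
    kept = {name: docs for name, docs in clusters.items() if scores[name] <= threshold}
    return kept, scores


# Hypothetical example data, not taken from the patent.
clusters = {
    "billing": ["refund my invoice", "refund my invoice"],
    "mixed": ["refund my invoice", "reset password now"],
}
kept, scores = candidate_clusters(clusters, threshold=0.9)
```

In this hypothetical data the "billing" cluster holds two identical documents, so its intrinsic similarity is at the maximum and it is removed, while the "mixed" cluster shares no terms between its documents and is kept as a candidate.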