US 11,853,339 B2
Techniques and components to find new instances of text documents and identify known response templates
Joerg Rings, Chicago, IL (US); William Thomas Romano, Riverview, FL (US); and Andre Gatorano, Palatine, IL (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Feb. 17, 2022, as Appl. No. 17/674,444.
Application 17/674,444 is a division of application No. 16/706,270, filed on Dec. 6, 2019, granted, now 11,288,300.
Application 16/706,270 is a division of application No. 16/536,993, filed on Aug. 9, 2019, granted, now 10,540,381, issued on Jan. 21, 2020.
Prior Publication US 2022/0171799 A1, Jun. 2, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/35 (2019.01); G06F 16/93 (2019.01); G06F 16/33 (2019.01); G06V 30/418 (2022.01); G06F 18/22 (2023.01); G06V 30/40 (2022.01); G06V 10/762 (2022.01); G06V 30/10 (2022.01)
CPC G06F 16/353 (2019.01) [G06F 16/334 (2019.01); G06F 16/93 (2019.01); G06F 18/22 (2023.01); G06V 10/762 (2022.01); G06V 30/40 (2022.01); G06V 30/418 (2022.01); G06V 30/10 (2022.01)]
16 Claims
 
1. A system, comprising:
a document repository, comprising a memory, storing a document corpus containing a plurality of incoming documents grouped in a respective cluster of a plurality of clusters based on a common theme and a plurality of document templates including response documents sharing the common theme to be sent in response to at least one of the incoming documents; and
a document analytics component including a processor and communication interface coupled to the document repository, wherein the processor of the document analytics component is operable to:
receive an incoming document;
access the document corpus in the document repository;
analyze the document corpus with reference to a common theme of the incoming document; and
based on results of the analyzing, select a response document template of a response document that shares the common theme of the incoming document;
determine a common theme of the incoming document in one respective cluster of the plurality of clusters, wherein the determining the common theme of the incoming document in one respective cluster of the plurality of clusters, is further operable to:
compare an intrinsic similarity value for each cluster in the plurality of clusters, wherein the intrinsic similarity value is based on a mean value of a cosine similarity between pairs of documents in the document corpus; and
based on the intrinsic similarity value of each respective cluster in the plurality of clusters exceeding an intrinsic similarity value threshold, remove the respective cluster from the document corpus as a candidate cluster.
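The last two limitations of claim 1 (an intrinsic similarity value per cluster, then removal of clusters whose value exceeds a threshold) can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not say how documents are turned into vectors or what the threshold is, so the two-dimensional vectors, the 0.9 threshold, the cluster names, and every function name below are hypothetical. The sketch also assumes the mean is taken over pairs of documents within a single cluster.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def intrinsic_similarity(vectors):
    """Mean cosine similarity over all unordered document pairs in one cluster."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0  # assumption: fewer than two documents scores 0
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


def remove_clusters_over_threshold(clusters, threshold):
    """Keep only clusters whose intrinsic similarity does not exceed the threshold."""
    return {
        name: vectors
        for name, vectors in clusters.items()
        if intrinsic_similarity(vectors) <= threshold
    }


# Hypothetical document vectors (for example, term counts over a two-word vocabulary).
clusters = {
    "tight": [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],
    "loose": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
}

# Hypothetical threshold; the claim does not specify a value.
candidates = remove_clusters_over_threshold(clusters, threshold=0.9)
```

In this toy data every pair in the "tight" cluster has cosine similarity 1, so its mean exceeds the threshold and the cluster is removed as a candidate, while the "loose" cluster remains.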