CPC G06Q 10/0635 (2013.01) [G06F 16/90335 (2019.01); G06F 16/95 (2019.01); G06F 21/6209 (2013.01); G06F 40/30 (2020.01); G06Q 10/067 (2013.01); G06Q 10/103 (2013.01); G06Q 30/01 (2013.01); G06Q 30/0282 (2013.01)]
16 Claims
1. A computer implemented method, comprising:
connecting, by one or more computing devices, a scraping interface to an enforcement information source;
scraping, by the scraping interface, enforcement information from the enforcement information source;
applying, by the one or more computing devices, a regular expression search to the enforcement information from the enforcement information source to identify first citation data formatted according to a regular form for regulatory citations within the enforcement information;
determining, by the one or more computing devices, a classification for the enforcement information associated with the first citation data;
classifying, by the one or more computing devices, a risk information document to determine a classification for the risk information document using phrase-based scoring on the contents of the risk information document, wherein the classification for the risk information document is associated with second citation data conforming to the regular form for regulatory citations;
comparing, by the one or more computing devices, the first citation data against the second citation data of the risk information document to determine a correspondence between the classification for the enforcement information and the classification for the risk information document;
storing, by the one or more computing devices, the enforcement information in association with the risk information document, based on a match between the first citation data and the second citation data;
presenting in an interface, by the one or more computing devices, a plurality of discrepancies, wherein the plurality of discrepancies includes a discrepancy based on the association of the enforcement information with the risk information document; and
permitting, by the one or more computing devices, arrangement of the plurality of discrepancies on the interface according to a risk level.
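
The regular expression search, phrase-based scoring, citation comparison, and risk-level arrangement recited in claim 1 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the claim: the CFR-style citation pattern standing in for the "regular form for regulatory citations", the phrase weights, the classification labels, the mapping from classifications to citations, and the integer risk scale. Scraping is omitted; the enforcement text is passed in as a plain string.

```python
import re
from dataclasses import dataclass

# Assumed "regular form": U.S. CFR-style citations such as
# "12 CFR 1026.18" or "12 C.F.R. § 1026.18(a)". Hypothetical choice.
CITATION_RE = re.compile(
    r"\b(\d{1,2})\s+C\.?F\.?R\.?\s+(?:§\s*|Part\s+)?(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)

# Hypothetical phrase weights per classification (phrase-based scoring).
PHRASE_WEIGHTS = {
    "truth_in_lending": {
        "annual percentage rate": 2.0,
        "finance charge": 1.5,
        "loan disclosure": 1.0,
    },
    "electronic_transfers": {
        "unauthorized transfer": 2.0,
        "error resolution": 1.5,
    },
}

# Hypothetical second citation data associated with each classification.
CLASS_CITATIONS = {
    "truth_in_lending": {"12 CFR 1026.18"},
    "electronic_transfers": {"12 CFR 1005.11"},
}


@dataclass
class RiskDocument:
    name: str
    text: str
    risk_level: int  # assumed scale: higher means riskier


@dataclass
class Discrepancy:
    document: str
    classification: str
    citations: list
    risk_level: int


def extract_citations(text):
    """Return normalized citations (e.g. '12 CFR 1026.18') found in text."""
    return {f"{title} CFR {section}" for title, section in CITATION_RE.findall(text)}


def classify_document(text):
    """Pick the classification with the highest phrase score, or None if all are zero."""
    lowered = text.lower()
    scores = {
        label: sum(weight * lowered.count(phrase) for phrase, weight in phrases.items())
        for label, phrases in PHRASE_WEIGHTS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def find_discrepancies(enforcement_text, documents):
    """Associate enforcement text with documents whose citations match."""
    first_citations = extract_citations(enforcement_text)
    found = []
    for doc in documents:
        label = classify_document(doc.text)
        if label is None:
            continue
        matched = first_citations & CLASS_CITATIONS[label]
        if matched:
            found.append(Discrepancy(doc.name, label, sorted(matched), doc.risk_level))
    return found


def arrange_by_risk(discrepancies):
    """Order discrepancies from highest to lowest risk level."""
    return sorted(discrepancies, key=lambda d: d.risk_level, reverse=True)


enforcement = "Respondent violated 12 C.F.R. § 1026.18(a) and 12 CFR 1005.11."
documents = [
    RiskDocument("lending-policy", "The finance charge and annual percentage rate must appear on each loan disclosure.", 2),
    RiskDocument("transfer-policy", "Error resolution steps apply to any unauthorized transfer.", 5),
    RiskDocument("travel-policy", "Employees book travel through the portal.", 1),
]
ordered = arrange_by_risk(find_discrepancies(enforcement, documents))
```

In this sketch the travel policy scores zero on every phrase list and is never associated with the enforcement text, while the two matching documents are listed with the higher assumed risk level first. A production system would likely derive the risk level and the citation mapping from data rather than from hard-coded tables.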