| CPC G06Q 50/265 (2013.01) [G06F 16/48 (2019.01); G06F 21/6245 (2013.01); G06F 40/40 (2020.01); G06N 3/08 (2013.01); G06Q 10/0635 (2013.01); G06Q 10/10 (2013.01); G06Q 50/01 (2013.01); G06F 3/0482 (2013.01)] | 17 Claims |

|
1. A computer-implemented method comprising:
identifying, with a natural language processing subsystem, a plurality of entities associated with private information by at least applying a trained machine learning model to a set of unstructured text data received from a graphical interface;
computing, by a scoring subsystem, a privacy score for the text data by identifying connections between the entities, the connections between the entities contributing to the privacy score according to a cumulative privacy risk, accounting for the risk of exposing certain entities together, the privacy score indicating potential exposure of the private information by the set of unstructured text data; and
updating, by a reporting subsystem in real time, the graphical interface to include an indicator distinguishing a target portion of the set of unstructured text data from other portions of the set of unstructured text data, wherein a modification to the target portion changes the potential exposure of the private information indicated by the privacy score,
wherein the machine learning model includes a neural network, the method further comprising training the neural network by:
retrieving, by a training subsystem, first training data for a first entity type associated with privacy risk from a first database;
retrieving, by the training subsystem, second training data for a second entity type associated with privacy risk from a second database; and
training, by the training subsystem, the neural network to identify the first entity type and the second entity type using the first training data and the second training data.
|
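The scoring step recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes the entities have already been identified by the trained model, and that the cumulative privacy risk is the sum of a per-type base risk plus an extra risk for each pair of entity types exposed together. The names `BASE_RISK`, `PAIR_RISK`, `privacy_score`, and `target_entity`, the entity type labels, and all weight values are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical base risk for each entity type exposed on its own.
BASE_RISK = {"NAME": 1.0, "DATE_OF_BIRTH": 2.0, "ADDRESS": 2.0}

# Hypothetical extra risk when two entity types are exposed together.
PAIR_RISK = {
    frozenset({"NAME", "DATE_OF_BIRTH"}): 4.0,
    frozenset({"NAME", "ADDRESS"}): 2.0,
    frozenset({"DATE_OF_BIRTH", "ADDRESS"}): 2.0,
}


def privacy_score(entities):
    """Return a cumulative risk score for a list of (text, type) entities.

    Assumption: the score is the sum of base risks for the distinct entity
    types present plus the pairwise risks for types that appear together.
    """
    types = sorted({etype for _, etype in entities})
    base = sum(BASE_RISK.get(t, 0.0) for t in types)
    joint = sum(
        PAIR_RISK.get(frozenset(pair), 0.0) for pair in combinations(types, 2)
    )
    return base + joint


def target_entity(entities):
    """Return the entity whose removal lowers the score the most, or None.

    This stands in for choosing the "target portion" that the interface
    would highlight. If a type occurs more than once, removing a single
    occurrence does not change the score in this simplified model.
    """
    full = privacy_score(entities)
    best, best_drop = None, 0.0
    for i, entity in enumerate(entities):
        remaining = entities[:i] + entities[i + 1:]
        drop = full - privacy_score(remaining)
        if drop > best_drop:
            best, best_drop = entity, drop
    return best


entities = [
    ("Jane Doe", "NAME"),
    ("1990-01-01", "DATE_OF_BIRTH"),
    ("12 Elm St", "ADDRESS"),
]
score = privacy_score(entities)
target = target_entity(entities)
```

With these illustrative weights, the date of birth is the entity whose modification would reduce the score the most, so it would be the portion flagged in the interface. A real system could weight connections differently, for example by proximity of the entities in the text.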