CPC G06F 21/6218 (2013.01) [G06F 16/35 (2019.01); G06F 16/383 (2019.01); G06F 40/151 (2020.01)] | 19 Claims |
1. A method for auto discovery of sensitive data in applications or databases using metadata via machine learning techniques, comprising:
receiving, at a data enrichment computer program in a metadata processing pipeline, raw metadata from a plurality of different data sources;
enriching, by the data enrichment computer program, the raw metadata;
converting, by the data enrichment computer program, the raw metadata and the enriched raw metadata into a sentence structure;
predicting, by a category prediction computer program in the metadata processing pipeline, a predicted category for the sentence structure;
identifying, by a sensitive data mapping computer program, a sensitive data category that is mapped to the predicted category based on a policy mapping rule;
determining, by the sensitive data mapping computer program, a risk classification rating for the predicted category; and
tagging, by the sensitive data mapping computer program, the data source associated with the metadata based on the risk classification rating.
|
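The steps recited in claim 1 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the abbreviation table, the category names, the policy mapping, the risk ratings, and the metadata field names (`source`, `table`, `column`) do not come from the claim. In particular, a simple keyword matcher stands in for the machine-learning category predictor, so this shows the data flow of the pipeline rather than the claimed model.

```python
# Hypothetical abbreviation table used to enrich raw column names.
ABBREVIATIONS = {
    "cust": "customer",
    "dob": "date of birth",
    "ssn": "social security number",
    "addr": "address",
}

# Hypothetical keyword lists standing in for a trained category classifier.
CATEGORY_KEYWORDS = {
    "government_id": ["social security number", "passport"],
    "birth_date": ["date of birth"],
    "contact": ["address", "email", "phone"],
}

# Hypothetical policy mapping rule: predicted category -> sensitive data category.
POLICY_MAP = {"government_id": "PII", "birth_date": "PII", "contact": "PII"}

# Hypothetical risk classification rating per predicted category.
RISK_RATING = {"government_id": "high", "birth_date": "medium", "contact": "medium"}


def enrich(raw: dict) -> dict:
    """Enrich raw metadata by expanding abbreviations in the column name."""
    tokens = raw["column"].lower().split("_")
    expanded = " ".join(ABBREVIATIONS.get(token, token) for token in tokens)
    return {"expanded_column": expanded}


def to_sentence(raw: dict, enriched: dict) -> str:
    """Convert the raw and enriched metadata into a sentence structure."""
    return (
        f"column {raw['column']} in table {raw['table']} "
        f"of source {raw['source']} means {enriched['expanded_column']}"
    )


def predict_category(sentence: str) -> str:
    """Stand-in for the category prediction model (keyword match only)."""
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in sentence for keyword in keywords):
            return category
    return "other"


def process(raw: dict) -> dict:
    """Run one metadata record through all steps and tag its data source."""
    enriched = enrich(raw)
    sentence = to_sentence(raw, enriched)
    predicted = predict_category(sentence)
    sensitive = POLICY_MAP.get(predicted)       # None when no policy rule applies
    rating = RISK_RATING.get(predicted, "low")  # default rating is an assumption
    return {
        "source": raw["source"],
        "sentence": sentence,
        "predicted_category": predicted,
        "sensitive_category": sensitive,
        "risk_rating": rating,
        "tag": f"risk:{rating}",
    }
```

Calling `process({"source": "crm_db", "table": "customers", "column": "cust_ssn"})` would walk a single record through enrichment, sentence conversion, prediction, policy mapping, rating, and tagging. In a real system the keyword matcher would be replaced by a trained text classifier operating on the sentence structure.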