US 11,899,807 B2
Systems and methods for auto discovery of sensitive data in applications or databases using metadata via machine learning techniques
Santosh Chikoti, Monroe Township, NJ (US); Jeffrey Kessler, Mahopac, NY (US); Ita B Lamont, North Brunswick, NJ (US); and Saurabh Gupta, Secaucus, NJ (US)
Assigned to JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed by JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed on Aug. 31, 2021, as Appl. No. 17/462,983.
Claims priority of provisional application 63/073,572, filed on Sep. 2, 2020.
Prior Publication US 2022/0067185 A1, Mar. 3, 2022
Int. Cl. G06F 21/62 (2013.01); G06F 40/151 (2020.01); G06F 16/35 (2019.01); G06F 16/383 (2019.01)
CPC G06F 21/6218 (2013.01) [G06F 16/35 (2019.01); G06F 16/383 (2019.01); G06F 40/151 (2020.01)] 19 Claims
 
1. A method for auto discovery of sensitive data in applications or databases using metadata via machine learning techniques, comprising:
receiving, at a data enrichment computer program in a metadata processing pipeline, raw metadata from a plurality of different data sources;
enriching, by the data enrichment computer program, the raw metadata;
converting, by the data enrichment computer program, the raw metadata and the enhanced raw metadata into a sentence structure;
predicting, by a category prediction computer program in the metadata processing pipeline, a predicted category for the sentence structure;
identifying, by a sensitive data mapping computer program, a sensitive data category that is mapped to the predicted category based on a policy mapping rule;
determining, by the sensitive data mapping computer program, a risk classification rating for the predicted category; and
tagging, by the sensitive data mapping computer program, the data source associated with the metadata based on the risk classification rating.
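
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name, abbreviation table, keyword list, and mapping below is a hypothetical assumption made for illustration, and the keyword matcher is only a stand-in for the machine-learning category prediction the claim describes.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    """One raw metadata record (hypothetical shape: source, table, column)."""
    source: str
    table: str
    column: str
    extra: dict = field(default_factory=dict)


# Hypothetical abbreviation table used for enrichment.
ABBREVIATIONS = {"ssn": "social security number", "dob": "date of birth"}

# Hypothetical keyword lists; a stand-in for a trained text classifier.
CATEGORY_KEYWORDS = {
    "national_identifier": ("social security",),
    "birth_date": ("date of birth",),
}

# Hypothetical policy mapping rule and risk ratings.
POLICY_MAPPING = {"national_identifier": "PII", "birth_date": "PII"}
RISK_RATING = {"national_identifier": "high", "birth_date": "medium"}


def enrich(raw):
    """Enrich raw metadata by expanding abbreviations in the column name."""
    tokens = raw.column.lower().split("_")
    expanded = " ".join(ABBREVIATIONS.get(t, t) for t in tokens)
    return Metadata(raw.source, raw.table, raw.column,
                    {"expanded_column": expanded})


def to_sentence(raw, enriched):
    """Convert raw and enriched metadata into a sentence structure."""
    return (f"column {raw.column} in table {raw.table} "
            f"means {enriched.extra['expanded_column']}")


def predict_category(sentence):
    """Predict a category for the sentence (keyword stand-in for a model)."""
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in sentence for keyword in keywords):
            return category
    return "other"


def process(records):
    """Run each record through the pipeline and collect tags per data source."""
    tags = {}
    for raw in records:
        enriched = enrich(raw)
        sentence = to_sentence(raw, enriched)
        category = predict_category(sentence)
        sensitive = POLICY_MAPPING.get(category, "none")  # policy mapping rule
        risk = RISK_RATING.get(category, "low")           # risk classification
        tags.setdefault(raw.source, []).append((raw.column, sensitive, risk))
    return tags


records = [
    Metadata("crm_db", "customers", "cust_ssn"),
    Metadata("crm_db", "customers", "fav_color"),
]
tags = process(records)
```

In this sketch, `tags` maps each data source name to a list of `(column, sensitive category, risk rating)` tuples. A real system would presumably replace `predict_category` with a trained model and load the mapping rules from policy configuration.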