US 12,277,236 B2
System and methods for intelligent entity-wide data protection
Prashant Ranjan Srivastava, Charlotte, NC (US); Michelle Andrea Boston, Richardson, TX (US); Kamalanathan Jeganathan, Karnataka (IN); Aravind Chandramohan Kumar, Kerala (IN); Miriam Levinsohn, Dallas, TX (US); Annabelle Williamson McLemore, Plano, TX (US); Catharina Rabie, Greenville, TX (US); Ananth Rajagopalan, Scarborough (CA); Parthiban Tiruvayur Shanmugam, Charlotte, NC (US); and Durga P. Turaga, Murphy, TX (US)
Assigned to BANK OF AMERICA CORPORATION, Charlotte, NC (US)
Filed by BANK OF AMERICA CORPORATION, Charlotte, NC (US)
Filed on Oct. 6, 2021, as Appl. No. 17/494,907.
Prior Publication US 2023/0105207 A1, Apr. 6, 2023
Int. Cl. G06F 21/60 (2013.01); G06F 16/906 (2019.01)
CPC G06F 21/604 (2013.01) [G06F 16/906 (2019.01)]
15 Claims
 
11. A computer implemented method for intelligent entity-wide data classification and protection, said computer implemented method comprising:
providing a computing system comprising a computer processing device and a non-transitory computer readable medium, where the computer readable medium comprises configured computer program instruction code, such that when said instruction code is operated by said computer processing device, said computer processing device performs the following operations:
receiving a data set for analysis, wherein the data set comprises multiple data files;
determining a data type and data format of the data set;
based on a scan of metadata of the data set, determining an associated application identification, storage location, and current classification status for the data set;
performing a sample scan of one or more of the multiple data files of the data set and determining a data field sampling, wherein the sample scan further comprises using affinity matching, context checking, or format matching to identify potentially sensitive information within the data fields that was not identified via the scan of the metadata of the data set;
performing a full scan of the data set resulting from the sample scan and determining a classification of the data fields in each of the multiple data files via a machine learning engine;
utilizing a three-layered discovery method comprising a first pass metadata scan, a quick scan, and a deep scan to detect identifying characteristics of private data;
based on the classification of the data fields in each of the multiple data files, determining one or more protection requirements and labelling the data fields with corresponding privacy levels;
retrieving the associated application identification and generating a report of classifications and protection requirements for the application identification;
transmitting the report to one or more user devices via one or more channels of communication; and
applying field-level encryption techniques to implement one or more protection requirements to protect sensitive data based on user access levels, wherein the one or more protection requirements further comprise a determination as to whether the data fields should be redacted, obfuscated, partially obfuscated, or encrypted according to one or more entity policies.
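
The layered discovery recited above (metadata scan, sample scan with format matching, full scan, then policy-driven protection) can be illustrated with a minimal sketch. It is not the patented implementation. All names, regular expressions, the name-hint list, the policy table, and the 0.8 threshold are hypothetical choices made for illustration, and a simple match-rate check stands in for the machine learning engine named in the claim.

```python
import re

# Hypothetical format patterns; a real deployment would use entity-specific rules.
FORMAT_PATTERNS = {
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}
# Hypothetical entity policy mapping a classification to a protection requirement.
POLICY = {"ssn": "encrypt", "email": "partially_obfuscate"}
# Hypothetical hints used by the first-pass metadata scan.
NAME_HINTS = ("ssn", "email", "dob")


def metadata_scan(field_names):
    """First pass: flag fields whose names alone suggest sensitive content."""
    return {n for n in field_names if any(h in n.lower() for h in NAME_HINTS)}


def sample_scan(rows, sample_size=2):
    """Quick pass: format-match values in the first few rows only."""
    hits = {}
    for row in rows[:sample_size]:
        for field, value in row.items():
            for label, pattern in FORMAT_PATTERNS.items():
                if pattern.fullmatch(str(value)):
                    hits.setdefault(field, label)  # first matching label wins
    return hits


def full_scan(rows, candidates, threshold=0.8):
    """Deep pass: confirm a candidate when enough of its values match.

    The match-rate check is a stand-in for the claim's machine learning engine.
    """
    confirmed = {}
    for field, label in candidates.items():
        values = [str(r[field]) for r in rows if field in r]
        matched = sum(1 for v in values if FORMAT_PATTERNS[label].fullmatch(v))
        if values and matched / len(values) >= threshold:
            confirmed[field] = label
    return confirmed


def protection_requirements(classification):
    """Map each classified field to a requirement; unknown labels are redacted."""
    return {f: POLICY.get(lbl, "redact") for f, lbl in classification.items()}


def partially_obfuscate(value, keep=4):
    """Mask all but the last `keep` characters of a value."""
    text = str(value)
    return "*" * max(len(text) - keep, 0) + text[-keep:]


rows = [
    {"cust_ref": "123-45-6789", "contact": "a@example.com", "note": "hello"},
    {"cust_ref": "987-65-4321", "contact": "b@example.org", "note": "call back"},
    {"cust_ref": "111-22-3333", "contact": "c@example.net", "note": "n/a"},
]
from_metadata = metadata_scan(rows[0].keys())
candidates = sample_scan(rows)
classification = full_scan(rows, candidates)
requirements = protection_requirements(classification)
```

In this sample the field names give no hint, so the metadata scan flags nothing, while the sample scan still picks up `cust_ref` and `contact` by value format. That is the case the claim describes as sensitive information "not identified via the scan of the metadata". Encryption itself is left out of the sketch; a vetted cryptographic library would be needed for the `encrypt` requirement.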