US 12,287,782 B2
Sensitive data discovery for databases
Christopher Robert Lumnah, North Providence, RI (US); Frank Schwaak, Recklinghausen (DE); Ganesa Sankar Balabharathi, San Ramon, CA (US); and Michael Patrick Oglesby, Indianapolis, IN (US)
Assigned to Rubrik, Inc., Palo Alto, CA (US)
Filed by Rubrik, Inc., Palo Alto, CA (US)
Filed on Mar. 25, 2022, as Appl. No. 17/705,174.
Prior Publication US 2023/0306129 A1, Sep. 28, 2023
Int. Cl. G06F 16/245 (2019.01)
CPC G06F 16/245 (2019.01)
19 Claims
 
1. A method, comprising:
transmitting, by a data management system, a request that a database management system for a database provide a set of metadata attributes for structured data within the database;
receiving, at the data management system based at least in part on transmitting the request, the set of metadata attributes for the structured data within the database;
performing, by the data management system, a pattern matching procedure to evaluate the set of metadata attributes for the structured data within the database against one or more patterns associated with a data type;
determining, by the data management system and based at least in part on the pattern matching procedure, that a plurality of locations within the database each comprise structured data of the data type;
outputting, by the data management system, a set of classification results for the database, the set of classification results indicating that the plurality of locations within the database each comprise structured data of the data type;
storing the set of classification results, wherein the set of classification results indicates that the plurality of locations within the database each comprise structured data of the data type;
receiving, by the data management system after outputting the set of classification results for the database, an indication that a first location included in the plurality of locations does not store structured data of the data type;
modifying, by the data management system, the set of classification results in response to receiving the indication that the first location included in the plurality of locations does not store structured data of the data type;
storing the modified set of classification results, wherein the modified set of classification results do not indicate that the first location stores structured data of the data type and indicate that one or more other locations included in the plurality of locations store structured data of the data type;
receiving, after storing the modified set of classification results, a request to restore a version of the database based at least in part on a snapshot of the database; and
performing a restoration procedure for the database in response to the request to restore the version of the database, wherein performing the restoration procedure for the database comprises:
masking data for the one or more other locations included in the plurality of locations based at least in part on the modified set of classification results indicating that one or more other locations included in the plurality of locations store structured data of the data type; and
refraining from masking data for the first location based at least in part on the modified set of classification results not indicating that the first location stores structured data of the data type.
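
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: the metadata returned by the database management system is stubbed as a list of (table, column) pairs, the patterns are hypothetical column-name regular expressions, a "location" is modeled as a table and column, and the names `PATTERNS`, `Location`, `classify`, `remove_false_positive`, and `restore_with_masking` do not come from the patent.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns associated with each data type (assumption, not from the patent).
PATTERNS = {
    "ssn": [re.compile(r"ssn|social.?sec", re.IGNORECASE)],
    "email": [re.compile(r"e.?mail", re.IGNORECASE)],
}

@dataclass(frozen=True)
class Location:
    """A location in the database, modeled here as a table and column."""
    table: str
    column: str

def classify(metadata, patterns=PATTERNS):
    """Match column-name metadata against the patterns.

    Returns {data_type: set of Locations whose column name matched}.
    """
    results = {}
    for table, column in metadata:
        for data_type, regexes in patterns.items():
            if any(rx.search(column) for rx in regexes):
                results.setdefault(data_type, set()).add(Location(table, column))
    return results

def remove_false_positive(results, data_type, location):
    """Return a modified copy of the results that no longer flags `location`."""
    modified = {k: set(v) for k, v in results.items()}
    modified.get(data_type, set()).discard(location)
    return modified

def restore_with_masking(snapshot, results, mask="****"):
    """Restore a snapshot ({table: [row dicts]}), masking only flagged columns."""
    flagged = {(loc.table, loc.column) for locs in results.values() for loc in locs}
    return {
        table: [
            {col: (mask if (table, col) in flagged else val) for col, val in row.items()}
            for row in rows
        ]
        for table, rows in snapshot.items()
    }

# Stubbed metadata, standing in for the response to the metadata request.
metadata = [("users", "ssn"), ("users", "email"), ("orders", "ssn_verified_flag")]
results = classify(metadata)

# A reviewer indicates that orders.ssn_verified_flag does not hold SSNs.
modified = remove_false_positive(results, "ssn", Location("orders", "ssn_verified_flag"))

# Stubbed snapshot contents used for the restoration step.
snapshot = {
    "users": [{"ssn": "123-45-6789", "email": "a@example.com", "name": "Ann"}],
    "orders": [{"id": 1, "ssn_verified_flag": True}],
}
restored = restore_with_masking(snapshot, modified)
```

In this sketch the column `orders.ssn_verified_flag` is first flagged because its name matches the hypothetical SSN pattern, then dropped from the stored results after the correction, so the restoration step masks `users.ssn` and `users.email` while leaving the flag column unmasked. Matching on column names alone is one simple reading of "metadata attributes"; other attributes could be used in the same way.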