US 11,954,224 B1
Database redaction for semi-structured and unstructured data
Yimeng Li, Bellevue, WA (US); Carl Yates Perry, Burlingame, CA (US); Raghavendran Ramakrishnan, Kirkland, WA (US); Frantisek Rolinek, Seattle, WA (US); and Yunqiao Zhang, Bellevue, WA (US)
Assigned to SNOWFLAKE INC., Bozeman, MT (US)
Filed by SNOWFLAKE INC., Bozeman, MT (US)
Filed on Aug. 29, 2023, as Appl. No. 18/239,527.
Application 18/239,527 is a continuation of application No. 18/304,063, filed on Apr. 20, 2023, granted, now Pat. No. 11,783,078.
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 21/62 (2013.01); G06F 16/28 (2019.01)
CPC G06F 21/6227 (2013.01) [G06F 16/285 (2019.01); G06F 21/6254 (2013.01)]
21 Claims
 
1. A method comprising:
receiving a masking policy for a column of a database, the masking policy identifying a first category of sensitive data and a second category of sensitive data;
examining the column to identify sensitive data in a first location of the column and a second location of the column, wherein the first location and the second location are within a same row of the column; and
in response to a data query accessing the column, the first location of the column exceeding a threshold probability of comprising sensitive data, executing, by a processing device, to generate redacted data for a response to the data query:
a first redaction operation to redact the first category of sensitive data from the first location of the column; and
a second redaction operation to redact the second category of sensitive data from the second location of the column.
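
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `MaskingPolicy`, `Finding`, `DETECTORS`, `examine`, `redact`, and `query_column`, the regular expressions, and the fixed probabilities are all hypothetical, chosen only to show one policy covering two categories, two sensitive locations within the same row, a probability threshold, and a separate redaction operation per category applied at query time. As a simplification, the sketch checks the threshold for each location independently.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Finding:
    """One suspected sensitive span inside a single cell (hypothetical type)."""
    category: str
    start: int
    end: int
    probability: float


@dataclass(frozen=True)
class MaskingPolicy:
    """Maps each sensitive-data category to its own redaction operation."""
    redactors: Dict[str, Callable[[str], str]]
    threshold: float


# Assumption: toy detectors with fixed confidence scores. A real system
# would use a classifier; these patterns are for illustration only.
DETECTORS = {
    "EMAIL": (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), 0.95),
    "PHONE": (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), 0.80),
}


def examine(value: str, policy: MaskingPolicy) -> List[Finding]:
    """Locate candidate sensitive spans for every category in the policy."""
    findings = []
    for category in policy.redactors:
        pattern, probability = DETECTORS[category]
        for match in pattern.finditer(value):
            findings.append(
                Finding(category, match.start(), match.end(), probability)
            )
    return sorted(findings, key=lambda f: f.start)


def redact(value: str, policy: MaskingPolicy) -> str:
    """Apply the per-category redaction to each span above the threshold."""
    pieces = []
    cursor = 0
    for f in examine(value, policy):
        # Skip spans that do not exceed the threshold or overlap a prior one.
        if f.probability <= policy.threshold or f.start < cursor:
            continue
        pieces.append(value[cursor:f.start])
        pieces.append(policy.redactors[f.category](value[f.start:f.end]))
        cursor = f.end
    pieces.append(value[cursor:])
    return "".join(pieces)


def query_column(rows: List[str], policy: MaskingPolicy) -> List[str]:
    """Return the column as a query response would see it, with redactions."""
    return [redact(row, policy) for row in rows]
```

In this sketch, a policy such as `MaskingPolicy({"EMAIL": lambda s: "[EMAIL]", "PHONE": lambda s: "***-***-" + s[-4:]}, threshold=0.5)` would replace an email address and partially mask a phone number found in the same cell, while the stored data is left unchanged because redaction happens only when the response is generated.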
 
8. A system comprising:
a memory; and
a processing device operatively coupled to the memory, the processing device to:
receive a masking policy for a column of a database, the masking policy identifying a first category of sensitive data and a second category of sensitive data;
examine the column to identify sensitive data in a first location of the column and a second location of the column, wherein the first location and the second location are within a same row of the column; and
in response to a data query accessing the column, the first location of the column exceeding a threshold probability of comprising sensitive data, execute, by the processing device, to generate redacted data for a response to the data query:
a first redaction operation to redact the first category of sensitive data from the first location of the column; and
a second redaction operation to redact the second category of sensitive data from the second location of the column.
 
15. A non-transitory computer-readable storage medium including instructions that, when executed by a processing device, cause the processing device to:
receive a masking policy for a column of a database, the masking policy identifying a first category of sensitive data and a second category of sensitive data;
examine the column to identify sensitive data in a first location of the column and a second location of the column, wherein the first location and the second location are within a same row of the column; and
in response to a data query accessing the column, the first location of the column exceeding a threshold probability of comprising sensitive data, execute, by the processing device, to generate redacted data for a response to the data query:
a first redaction operation to redact the first category of sensitive data from the first location of the column; and
a second redaction operation to redact the second category of sensitive data from the second location of the column.