US 11,775,684 B2
Rule-based document scrubbing of sensitive data
Brian Boon, Redmond, WA (US); Dinesh Chandnani, Sammamish, WA (US); Zhu Chen, Redmond, WA (US); Ram Kumar Donthula, Redmond, WA (US); Matthew Sloan Theodore Evans, Kindred, ND (US); Andrew Neil, Seattle, WA (US); Vijaya Upadya, Sammamish, WA (US); Geoffrey Staneff, Woodinville, WA (US); Shibani Basava, Seattle, WA (US); Evgenia Steshenko, Seattle, WA (US); Carl Brochu, Renton, WA (US); Shaun Miller, Sammamish, WA (US); and Xin Shi, Kirkland, WA (US)
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC., Redmond, WA (US)
Filed by MICROSOFT TECHNOLOGY LICENSING, LLC., Redmond, WA (US)
Filed on Aug. 16, 2022, as Appl. No. 17/888,908.
Application 17/888,908 is a continuation of application No. 16/408,143, filed on May 9, 2019, granted, now 11,449,635.
Claims priority of provisional application 62/672,071, filed on May 16, 2018.
Prior Publication US 2022/0391538 A1, Dec. 8, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 21/62 (2013.01); H04L 9/06 (2006.01); H04L 9/40 (2022.01); G06F 21/60 (2013.01)
CPC G06F 21/6254 (2013.01) [G06F 21/6245 (2013.01); H04L 9/0643 (2013.01); H04L 63/0421 (2013.01); G06F 21/602 (2013.01)] 15 Claims
 
1. A system comprising:
one or more processors; and a memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs include instructions that:
obtain a plurality of documents having telemetric data and sensitive data, a first set of the plurality of documents having a plurality of fields arranged in a first format, a second set of the plurality of documents having a plurality of fields arranged in a second format, wherein the first format and the second format differ;
access a script having a plurality of rules, a rule identifying a select one of the plurality of fields of a select one of the plurality of documents of a specific format as sensitive data and including a scrubbing action;
apply the plurality of rules of the script to each of the plurality of documents to identify sensitive data and to associate a scrubbing action for the identified sensitive data;
tag each of the plurality of documents with a tag indicating the scrubbing action from the application of the plurality of rules;
aggregate select ones of the plurality of documents tagged with a similar tag;
perform a select scrubbing action associated with the similar tag to each of the selected aggregated documents; and
process the telemetric data without the sensitive data.
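
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Rule` and `Document` classes, the `drop` and `hash` actions, the format names, and the field names are all hypothetical, chosen only to show the flow of applying rules, tagging, aggregating by tag, scrubbing each group, and leaving the telemetric data for processing. The use of SHA-256 for the `hash` action is likewise an assumption.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: marks one field of one document format as sensitive."""
    doc_format: str   # format the rule applies to
    field_name: str   # field treated as sensitive in that format
    action: str       # scrubbing action name: "drop" or "hash"


@dataclass
class Document:
    doc_format: str
    fields: dict
    tags: dict = field(default_factory=dict)  # action name -> sensitive field names


def drop(doc, names):
    """Remove the sensitive fields entirely."""
    for name in names:
        doc.fields.pop(name, None)


def hash_fields(doc, names):
    """Replace each sensitive value with its SHA-256 hex digest (assumed action)."""
    for name in names:
        if name in doc.fields:
            value = str(doc.fields[name]).encode("utf-8")
            doc.fields[name] = hashlib.sha256(value).hexdigest()


ACTIONS = {"drop": drop, "hash": hash_fields}


def tag_documents(docs, rules):
    """Apply every rule to every document and record the resulting tags."""
    for doc in docs:
        for rule in rules:
            if rule.doc_format == doc.doc_format and rule.field_name in doc.fields:
                doc.tags.setdefault(rule.action, []).append(rule.field_name)


def aggregate_by_tag(docs):
    """Group documents that carry the same scrubbing-action tag."""
    groups = defaultdict(list)
    for doc in docs:
        for action in doc.tags:
            groups[action].append(doc)
    return groups


def scrub(docs, rules):
    """Tag, aggregate, scrub each group, and return the remaining data."""
    tag_documents(docs, rules)
    for action, group in aggregate_by_tag(docs).items():
        for doc in group:
            ACTIONS[action](doc, doc.tags[action])
    return [doc.fields for doc in docs]


# Two documents in different formats, each with its own sensitive field.
rules = [
    Rule("crash_report", "user_email", "drop"),
    Rule("usage_event", "machine_name", "hash"),
]
docs = [
    Document("crash_report", {"user_email": "a@example.com", "stack_depth": 12}),
    Document("usage_event", {"machine_name": "DEV-PC-01", "duration_ms": 840}),
]
clean = scrub(docs, rules)
```

Grouping by tag before scrubbing means each action runs once per batch of similarly tagged documents, which mirrors the aggregate-then-scrub ordering in the claim. A real system would presumably read the rules from a script rather than hard-code them.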