US 12,380,228 B2
Sanitizing personally identifiable information (PII) in audio and visual data
Todd Mozer, Los Altos Hills, CA (US); Pieter Vermeulen, Portland, OR (US); and Jonathan Welch, Douglasville, GA (US)
Assigned to SENSORY, INCORPORATED, Santa Clara, CA (US)
Filed by Sensory, Incorporated, Santa Clara, CA (US)
Filed on Nov. 14, 2022, as Appl. No. 18/055,291.
Application 18/055,291 is a continuation-in-part of application No. 17/579,383, filed on Jan. 19, 2022, granted, now Pat. No. 12,248,603.
Prior Publication US 2023/0229790 A1, Jul. 20, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 21/00 (2013.01); G06F 21/53 (2013.01); G06F 21/60 (2013.01)
CPC G06F 21/602 (2013.01) [G06F 21/53 (2013.01)]
18 Claims
 
18. A hardware circuit comprising:
a first set of logic blocks configured to receive an audio or visual (A/V) data sample from an A/V capture module via a secure communication channel;
a second set of logic blocks configured to pre-process the A/V data sample to detect speech or appearance of one or more people in the A/V data sample, the pre-processing being performed while the hardware circuit operates in a lower power mode; and
a third set of logic blocks configured to:
in response to detecting the speech or appearance of one or more people in the A/V data sample:
transition the hardware circuit from the lower power mode to a higher power mode; and
sanitize the A/V data sample while operating in the higher power mode,
wherein the sanitizing includes identifying personally identifiable information (PII) in the A/V data sample and removing, obfuscating, and/or transforming the identified PII,
wherein the sanitizing results in a sanitized version of the A/V data sample,
wherein the sanitizing is performed only in response to detecting the speech or appearance of one or more people in the A/V data sample, and
wherein other portions of the computing device are blocked, via a hardware mechanism, from accessing audio or visual data captured by the A/V capture module until the speech or appearance of one or more people in the A/V data sample is detected.
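
The control flow recited in claim 18 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware circuit or the patentee's implementation: the names SanitizerCircuit, PowerMode, toy_detect, and toy_sanitize are hypothetical, text strings stand in for A/V data samples, and a boolean flag stands in for the hardware blocking mechanism. The claim does not say what the other portions of the device see once the block is lifted, so the model assumes they see only sanitized samples.

```python
import re
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List


class PowerMode(Enum):
    LOW = auto()
    HIGH = auto()


@dataclass
class SanitizerCircuit:
    """Toy model of the three sets of logic blocks (all names hypothetical)."""
    detect_person: Callable[[str], bool]  # stand-in for the pre-processing detector
    sanitize: Callable[[str], str]        # stand-in for the PII sanitizer
    mode: PowerMode = PowerMode.LOW
    gate_open: bool = False               # stand-in for the hardware block
    sanitized: List[str] = field(default_factory=list)

    def receive(self, sample: str) -> None:
        # Pre-processing runs in whatever mode the circuit is in (LOW at start).
        if not self.detect_person(sample):
            return  # no speech or appearance: no sanitizing, gate unchanged
        # Detection triggers the transition, and only then the sanitizing.
        self.mode = PowerMode.HIGH
        self.sanitized.append(self.sanitize(sample))
        self.gate_open = True

    def read(self) -> List[str]:
        # Models the rest of the device: blocked until a person is detected.
        if not self.gate_open:
            raise PermissionError("A/V data is blocked until a person is detected")
        return list(self.sanitized)


def toy_detect(sample: str) -> bool:
    # Assumption: samples tagged "speech:" contain a person's voice.
    return sample.startswith("speech:")


def toy_sanitize(sample: str) -> str:
    # Assumption: digits are the only PII; obfuscate each one.
    return re.sub(r"\d", "#", sample)


circuit = SanitizerCircuit(toy_detect, toy_sanitize)
circuit.receive("noise: fan hum 60")      # stays in LOW, gate stays closed
circuit.receive("speech: call 555-1234")  # moves to HIGH, sanitizes, opens gate
```

In this model the first sample never reaches the sanitizer and never becomes readable, which mirrors the "only in response to detecting" limitation. Whether the circuit later returns to the lower power mode is not stated in the claim and is left out of the sketch.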