| CPC G06F 16/162 (2019.01) [G06F 9/546 (2013.01); G06F 16/164 (2019.01); G06F 16/1734 (2019.01); G06F 16/23 (2019.01); G06F 21/6254 (2013.01)] | 20 Claims |

|
1. A method of operating a data access system for data processing environments comprising multiple application services and multiple storage services, the method comprising:
receiving, from a user, a request for a data treatment for anonymizing or removing user data;
obtaining a staleness tolerance requirement for performing the data treatment based on a privacy policy;
maintaining a list of data instances in a set of files of a table, wherein a data instance is associated with the request for the data treatment;
identifying one or more files, stored in a storage service of the multiple storage services, needing the data treatment for anonymizing or deleting one or more data instances in a file of the one or more files based on the request, wherein the one or more files are associated with the data instances in the list, wherein the storage service includes an immutable storage system;
determining whether a staleness of the file meets the staleness tolerance requirement for performing the data treatment;
in response to the staleness of the file meeting the staleness tolerance requirement,
adding the file to a queue maintained in the data access system by a planner, wherein only the planner adds to the queue when the file is added to the queue, wherein the queue is shared by a plurality of workers in the data access system and queues tasks for the workers for performing the data treatment, each file in the queue is accessible by each of the plurality of workers, and each worker proceeds to dequeue the tasks from the queue sequentially;
prioritizing the file over a second file in the queue, the file having a sooner deadline based on the staleness tolerance requirement than a second deadline of the second file based on a second staleness tolerance requirement for the second file, wherein the deadline for the file is a deadline to perform the data treatment within the staleness tolerance requirement specified in the privacy policy;
after adding the file to the queue, dequeuing the file from the queue and treating the file by at least one worker of the plurality of workers, wherein treating the file includes the at least one worker anonymizing or deleting the one or more data instances in the file to create a new file; and
replacing a previous version of the file in the storage service with the treated new file.
|
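The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation; it only shows one way a single planner could feed a deadline-ordered queue that workers drain, with each treated file replacing its previous version. All names (`Task`, `ImmutableStore`, `Planner`, `work_once`, `horizon`) are hypothetical. The staleness check is modeled under an assumed reading: a file's deadline is the request time plus the staleness tolerance, and the file is queued once that deadline falls within a planning window. Files are modeled as tuples of `(user_id, value)` rows, which is also an assumption.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    deadline: float                               # the only field used for ordering
    path: str = field(compare=False)
    user_ids: frozenset = field(compare=False)


class ImmutableStore:
    """Toy immutable storage: a file is never edited, only swapped whole."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def replace(self, path, rows):
        self._files[path] = tuple(rows)           # new version supersedes the old


class Planner:
    """Sole producer for the shared queue (hypothetical design)."""

    def __init__(self, horizon):
        self.horizon = horizon                    # assumed planning window
        self.queue = []                           # min-heap ordered by deadline

    def plan(self, path, user_ids, requested_at, tolerance, now):
        deadline = requested_at + tolerance
        if deadline - now <= self.horizon:        # assumed form of the staleness check
            heapq.heappush(self.queue, Task(deadline, path, frozenset(user_ids)))
            return True
        return False


def work_once(queue, store, anonymize=False):
    """One worker step: dequeue the most urgent file, treat it, replace it."""
    task = heapq.heappop(queue)
    treated = []
    for user_id, value in store.read(task.path):
        if user_id not in task.user_ids:
            treated.append((user_id, value))      # row not covered by the request
        elif anonymize:
            treated.append((None, value))         # keep the value, drop the identity
        # otherwise the row is deleted by omission
    store.replace(task.path, treated)
    return task.path
```

With two queued files whose tolerances are 90 and 30 time units from the same request time, `work_once` treats the 30-unit file first because its deadline is sooner. The sketch is single-threaded; sharing the queue among several concurrent workers would need synchronization, for example the standard library's `queue.PriorityQueue`.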