CPC G06F 16/215 (2019.01) | 11 Claims |
1. A method for managing data patterns, including:
acquiring multiple sets of data patterns respectively associated with multiple collection devices, wherein the multiple collection devices are located in an edge network in an application environment, and wherein a set of data patterns in the multiple sets of data patterns represents patterns of duplicate data in data from one of the multiple collection devices;
generating, based on the multiple sets of data patterns, multiple pattern features, wherein each one of the pattern features is generated for a respective set of data patterns in the multiple sets of data patterns, and wherein each pattern feature includes a number of occurrences of each individual data pattern in the respective set of data patterns;
dividing the multiple collection devices into multiple groups based on the pattern features;
determining, based on the numbers of occurrences of data patterns included in sets of data patterns associated with collection devices in a group in the multiple groups, a set of shared data patterns for sharing among the collection devices in the group;
distributing the set of shared data patterns to an edge computing device in the edge network, wherein the edge computing device is connected to a target collection device in the multiple collection devices in the group;
instructing the edge computing device to generate de-duplicated data of target data from the target collection device based on the set of shared data patterns, wherein the de-duplicated data is smaller than the target data;
instructing the edge computing device to transmit the de-duplicated data to a server device that is used to process the target data; and
whereby transmission of the de-duplicated data to the server device reduces overhead of storage resources involved in data storage by the server device.
|
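For illustration only, the steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claim does not specify how pattern features are compared, how devices are grouped, or how many patterns are shared. Cosine similarity, the greedy grouping rule, the `threshold` and `top_k` parameters, the chunk-level reference encoding, and all function and device names are hypothetical choices made for this example.

```python
from collections import Counter


def pattern_feature(patterns):
    """Pattern feature: number of occurrences of each individual data pattern."""
    return Counter(patterns)


def cosine(a, b):
    """Cosine similarity between two occurrence counters (assumed metric)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_devices(features, threshold=0.5):
    """Greedy grouping (assumed rule): a device joins the first group whose
    first member has a sufficiently similar pattern feature."""
    groups = []
    for device, feature in features.items():
        for group in groups:
            if cosine(features[group[0]], feature) >= threshold:
                group.append(device)
                break
        else:
            groups.append([device])
    return groups


def shared_patterns(group, features, top_k=2):
    """Pick the top_k most frequent patterns across all devices in a group."""
    total = Counter()
    for device in group:
        total.update(features[device])
    return [pattern for pattern, _ in total.most_common(top_k)]


def deduplicate(chunks, shared):
    """Edge-side step: replace chunks matching a shared pattern with a short
    index reference; chunks without a match are kept as raw data."""
    index = {pattern: i for i, pattern in enumerate(shared)}
    return [("ref", index[c]) if c in index else ("raw", c) for c in chunks]


# Hypothetical duplicate-data patterns observed per collection device.
observed = {
    "cam1": [b"AAAA", b"AAAA", b"BBBB"],
    "cam2": [b"AAAA", b"BBBB", b"BBBB"],
    "therm1": [b"XXXX", b"XXXX", b"YYYY"],
}
features = {dev: pattern_feature(p) for dev, p in observed.items()}
groups = group_devices(features)
shared = shared_patterns(groups[0], features)
encoded = deduplicate([b"AAAA", b"CCCC", b"BBBB"], shared)
```

In this sketch the two cameras fall into one group because their occurrence counts overlap, while the thermometer forms its own group. The edge computing device would then transmit `encoded` in place of the raw chunks, so that each chunk matching a shared pattern travels as a small index rather than as the full payload.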