| CPC G06F 11/3051 (2013.01) [G06F 8/62 (2013.01); G06F 9/4411 (2013.01); G06F 9/44505 (2013.01); G06F 40/284 (2020.01)] | 20 Claims |

|
1. A non-transitory computer-readable data storage medium storing program code executable by a processor to perform processing comprising:
respectively tokenizing a plurality of strings of a text file representing a configuration of a target device into a plurality of tokens for the configuration;
shingling the tokens for the configuration;
determining a target device signature representing the configuration of the target device by applying a min-wise independent permutations locality sensitive hashing (MinHash) technique to the tokens as have been shingled; and
identifying whether the configuration of the target device is anomalous based on the target device signature.
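
The tokenizing, shingling, and MinHash steps recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: whitespace tokenization, shingles of `k` consecutive tokens that may span line boundaries, and seeded BLAKE2b hashes standing in for the min-wise independent permutations are all choices made here for illustration. The names `tokenize`, `shingle`, and `minhash_signature` are hypothetical.

```python
import hashlib
import re


def tokenize(lines):
    """Tokenize each string of the configuration text in turn (assumption: split on whitespace)."""
    tokens = []
    for line in lines:
        tokens.extend(re.findall(r"\S+", line))
    return tokens


def shingle(tokens, k=3):
    """Return the set of k-token shingles (assumption: shingles may span line boundaries)."""
    if not tokens:
        return set()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def _seeded_hash(seed, text):
    """64-bit hash of a shingle under a given seed; each seed approximates one permutation."""
    data = f"{seed}:{text}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingles, num_hashes=64):
    """MinHash signature: for each seeded hash, keep the minimum value over all shingles."""
    if not shingles:
        raise ValueError("cannot compute a signature for an empty shingle set")
    return [min(_seeded_hash(seed, s) for s in shingles) for seed in range(num_hashes)]
```

A signature for a small, made-up configuration could then be obtained with `minhash_signature(shingle(tokenize(["hostname sw1", "vlan 10"]), k=2))`. How the signature is then used to decide whether the configuration is anomalous is left open by claim 1 and is not shown in this sketch.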
|
|
13. A computing device comprising:
a processor; and
a memory storing program code executable by the processor to:
for each of a plurality of devices, respectively tokenize a plurality of strings of a text file representing a configuration of the device into a plurality of tokens for the configuration;
for each of the plurality of devices, shingle the tokens for the configuration of the device;
for each of the plurality of devices, determine a device signature representing the configuration of the device by applying a min-wise independent permutations locality sensitive hashing (MinHash) technique to the tokens for the configuration of the device as have been shingled;
perform a locality-sensitive hashing (LSH) technique on the device signatures to assign the device signatures within a plurality of hash buckets;
cluster the devices within a plurality of clusters based on assignment of the device signatures within the plurality of hash buckets; and
for each of the plurality of devices, identify whether the configuration of the device is anomalous based on which of the clusters the device has been clustered within.
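
The bucketing, clustering, and anomaly steps of claim 13 can be sketched as follows, assuming the device signatures have already been computed. The claim does not specify the LSH scheme or the anomaly rule, so everything concrete here is an assumption made for illustration: the common banding form of LSH (each band of a signature is a bucket key), clusters formed by merging devices that share any bucket, and devices in clusters smaller than `min_cluster_size` treated as anomalous. All function names are hypothetical.

```python
from collections import defaultdict


def lsh_buckets(signatures, bands=4):
    """Assign each device to one bucket per band; the bucket key is (band index, band values)."""
    buckets = defaultdict(set)
    for device, sig in signatures.items():
        rows = len(sig) // bands  # rows per band (any leftover values are ignored)
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(device)
    return buckets


def cluster_devices(signatures, bands=4):
    """Cluster devices that share at least one bucket, using a small union-find."""
    parent = {device: device for device in signatures}

    def find(device):
        while parent[device] != device:
            parent[device] = parent[parent[device]]  # path halving
            device = parent[device]
        return device

    for members in lsh_buckets(signatures, bands).values():
        first, *rest = sorted(members)
        for other in rest:
            parent[find(other)] = find(first)

    clusters = defaultdict(set)
    for device in signatures:
        clusters[find(device)].add(device)
    return list(clusters.values())


def anomalous_devices(signatures, bands=4, min_cluster_size=2):
    """Assumed rule: a device is anomalous if its cluster has fewer than min_cluster_size members."""
    return {
        device
        for cluster in cluster_devices(signatures, bands)
        if len(cluster) < min_cluster_size
        for device in cluster
    }
```

With the toy signatures `{"a": [1, 2, 3, 4], "b": [1, 2, 9, 9], "c": [7, 7, 7, 7]}` and `bands=2`, devices `a` and `b` share the first-band bucket and are clustered together, while `c` ends up alone and is the only device flagged.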
|
|
18. A method comprising:
respectively tokenizing, by a processor, a plurality of strings of a text file representing a configuration of a target device into a plurality of tokens for the configuration;
shingling, by the processor, the tokens for the configuration;
generating, by the processor, a target device signature representing the configuration of the target device by applying a min-wise independent permutations locality sensitive hashing (MinHash) technique to the tokens as have been shingled;
comparing, by the processor, the target device signature representing the configuration of the target device to a reference device signature representing a reference configuration of a reference device to calculate a similarity score indicative of how similar the configuration of the target device is to the reference configuration; and
identifying, by the processor, whether the configuration of the target device is anomalous based on the similarity score.
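
The comparing and identifying steps of claim 18 can be sketched as follows. A standard property of MinHash is that the fraction of positions at which two signatures agree estimates the Jaccard similarity of the underlying shingle sets; using that fraction as the similarity score, and flagging the target when the score falls below a fixed `threshold`, are assumptions made here for illustration. The function names and the default threshold of 0.8 are hypothetical.

```python
def signature_similarity(target_sig, reference_sig):
    """Fraction of matching positions, an estimate of the Jaccard similarity of the shingle sets."""
    if not target_sig or len(target_sig) != len(reference_sig):
        raise ValueError("signatures must be non-empty and of equal length")
    matches = sum(1 for t, r in zip(target_sig, reference_sig) if t == r)
    return matches / len(target_sig)


def is_anomalous(target_sig, reference_sig, threshold=0.8):
    """Assumed rule: the target configuration is anomalous if its score is below the threshold."""
    return signature_similarity(target_sig, reference_sig) < threshold
```

For example, `signature_similarity([1, 2, 3, 4], [1, 2, 0, 4])` is 0.75, so with the assumed threshold of 0.8 the target would be flagged as anomalous.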
|