CPC H04L 63/1425 (2013.01) [G06N 5/04 (2013.01); G06N 20/00 (2019.01)] | 20 Claims |
1. A system comprising:
one or more computing devices that implement a synthetic data generation system, configured to:
obtain a plurality of observed datapoints in a feature space encoding metadata of hosts;
select an observed datapoint from the plurality of observed datapoints;
select a direction of a synthetic datapoint relative to the observed datapoint in the feature space;
generate a plurality of synthetic datapoints in the direction with increasing distances;
stop the generation of the synthetic datapoints in response to a determination that a probability of observing a last one of the synthetic datapoints is less than a specified threshold; and
add the synthetic datapoints to a dataset, wherein the dataset is used to train or test one or more machine learning models used to analyze the metadata.
|
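The generation loop recited in claim 1 can be illustrated with a minimal sketch. The claim does not say how the probability of observing a datapoint is estimated, how the direction is chosen, or how the distances grow. The sketch therefore assumes a Gaussian kernel density estimate over the observed datapoints as a stand-in for that probability, a uniformly random unit direction, and a fixed step size. The names `kde_density`, `random_unit_vector`, `generate_along_direction`, `step`, `bandwidth`, and `max_steps` are hypothetical and do not come from the claim.

```python
import math
import random


def kde_density(point, observed, bandwidth):
    """Gaussian kernel density estimate at `point`.

    Assumption: this density stands in for the claim's "probability of
    observing" a datapoint; the claim does not name an estimator.
    """
    dim = len(point)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for obs in observed:
        sq_dist = sum((p - o) ** 2 for p, o in zip(point, obs))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(observed) * norm)


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (assumed strategy)."""
    while True:
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        length = math.sqrt(sum(v * v for v in vec))
        if length > 0.0:
            return [v / length for v in vec]


def generate_along_direction(observed, threshold, step=0.5,
                             bandwidth=1.0, max_steps=1000, rng=None):
    """Walk away from one observed datapoint until the density is too low."""
    rng = rng or random.Random()
    start = rng.choice(observed)                     # select an observed datapoint
    direction = random_unit_vector(len(start), rng)  # select a direction
    synthetic = []
    for k in range(1, max_steps + 1):                # increasing distances k * step
        candidate = [s + k * step * d for s, d in zip(start, direction)]
        synthetic.append(candidate)
        # Stop once the last generated datapoint falls below the threshold.
        if kde_density(candidate, observed, bandwidth) < threshold:
            break
    return start, direction, synthetic


# Toy two-dimensional feature vectors standing in for encoded host metadata.
observed = [[0.0, 0.0], [1.0, 0.5], [0.5, 1.0], [1.5, 1.5]]
start, direction, synthetic = generate_along_direction(
    observed, threshold=1e-3, rng=random.Random(0))
dataset = observed + synthetic  # add the synthetic datapoints to a dataset
```

In this sketch the final synthetic datapoint, the one whose estimated density drops below the threshold, is kept in the dataset, which is one reading of the stopping condition. The `max_steps` cap is a safety guard added for the example only.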