US 12,306,938 B2
Spurious-data-based detection related to malicious activity
Galen Rafferty, Mahomet, IL (US); Samuel Sharpe, Cambridge, MA (US); Brian Barr, Schenectady, NY (US); Jeremy Goodsitt, Champaign, IL (US); Michael Davis, Arlington, VA (US); Taylor Turner, Richmond, VA (US); Justin Au-Yeung, Somerville, MA (US); and Owen Reinert, Queens, NY (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Feb. 16, 2023, as Appl. No. 18/170,502.
Prior Publication US 2024/0281525 A1, Aug. 22, 2024
Int. Cl. G06F 21/00 (2013.01); G06F 21/55 (2013.01); G06F 21/56 (2013.01)
CPC G06F 21/554 (2013.01) [G06F 21/566 (2013.01); G06F 2221/034 (2013.01)]
20 Claims
1. A system for using spurious data samples in a dataset to determine a time window during which a malicious device caused a cybersecurity incident, the system comprising:
one or more processors; and
a non-transitory, computer readable medium having instructions recorded thereon that, when executed by the one or more processors, cause operations comprising:
obtaining a first dataset comprising a set of original data samples and a first set of spurious data samples, wherein spurious data samples of the first set of spurious data samples are stored at locations, identifiable by a key, within the first dataset, wherein the first set of spurious data samples are configured to decrease accuracy of a machine learning model by more than a threshold percentage amount;
based on a time period expiring, replacing the first set of spurious data samples in the first dataset with a second set of spurious data samples;
obtaining an indication that a second dataset is available via a third-party computing device;
determining that a subset of samples of the second dataset match the first set of spurious data samples;
based on the subset of samples of the second dataset matching the first set of spurious data samples, determining a time window in which a cybersecurity incident occurred, wherein the time window corresponds to a time before the first set of spurious data samples were replaced with the second set of spurious data samples; and
outputting an indication of the time window.
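
The operations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (SpuriousBatch, keyed_locations, build_dataset, rotate, incident_window) are hypothetical, as is the choice of an HMAC-seeded random generator to derive key-identifiable locations. The sketch does not model the claim's requirement that the spurious samples reduce a machine learning model's accuracy by more than a threshold percentage. It assumes samples are hashable tuples and that a leaked dataset is compared by exact match.

```python
import hashlib
import hmac
import random
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Sequence, Tuple


@dataclass
class SpuriousBatch:
    """One set of spurious samples and the period it was live in the dataset."""
    samples: frozenset
    inserted_at: datetime
    replaced_at: Optional[datetime] = None  # None while the batch is still live


def keyed_locations(key: bytes, batch_id: int, total: int, count: int) -> List[int]:
    """Derive spurious-sample positions from a secret key (assumed scheme)."""
    digest = hmac.new(key, str(batch_id).encode(), hashlib.sha256).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return sorted(rng.sample(range(total), count))


def build_dataset(original: Sequence[tuple], spurious: Sequence[tuple],
                  key: bytes, batch_id: int) -> List[tuple]:
    """Place spurious samples among the originals at the key-derived positions."""
    total = len(original) + len(spurious)
    slots = set(keyed_locations(key, batch_id, total, len(spurious)))
    orig_iter, spur_iter = iter(original), iter(spurious)
    return [next(spur_iter) if i in slots else next(orig_iter) for i in range(total)]


def rotate(history: List[SpuriousBatch], new_samples: Sequence[tuple],
           now: datetime) -> None:
    """On expiry of the time period, close the live batch and record the new one."""
    if history and history[-1].replaced_at is None:
        history[-1].replaced_at = now
    history.append(SpuriousBatch(frozenset(new_samples), now))


def incident_window(leaked: Sequence[tuple], history: List[SpuriousBatch],
                    min_matches: int = 1) -> Optional[Tuple[datetime, Optional[datetime]]]:
    """Return (inserted_at, replaced_at) of the first batch found in the leak."""
    leaked_set = set(leaked)
    for batch in history:
        if len(batch.samples & leaked_set) >= min_matches:
            return batch.inserted_at, batch.replaced_at
    return None  # no spurious samples recognized in the leaked data


# Illustrative data only; every value below is made up for the example.
key = b"demo-key"
original = [("alice", 120), ("bob", 75), ("carol", 310)]
first = [("zz-001", 9999), ("zz-002", 8888)]
second = [("zz-101", 7777), ("zz-102", 6666)]

history: List[SpuriousBatch] = []
t0, t1 = datetime(2023, 1, 1), datetime(2023, 2, 1)
rotate(history, first, t0)
dataset_v1 = build_dataset(original, first, key, batch_id=0)
rotate(history, second, t1)  # the time period expired: first set is replaced
dataset_v2 = build_dataset(original, second, key, batch_id=1)

# Suppose dataset_v1 later surfaces on a third-party device.
window = incident_window(dataset_v1, history)
print("Incident window:", window)
```

Because the leaked copy contains the first set of spurious samples, the returned window ends at the moment that set was replaced, which matches the claim's statement that the window corresponds to a time before the replacement. A shorter rotation period would narrow the window at the cost of more frequent dataset updates.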