US 12,204,645 B1
Machine learning model evaluation and comparison
MohamadAli Torkamani, Scarsdale, NY (US); Bhavna Soman, Seattle, WA (US); Jeffrey Earl Bickford, Thornton, CO (US); and Baris Coskun, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Nov. 16, 2021, as Appl. No. 17/528,019.
Int. Cl. G06F 21/56 (2013.01); G06F 21/57 (2013.01); G06N 20/20 (2019.01)

CPC G06F 21/57 (2013.01) [G06F 21/566 (2013.01); G06N 20/20 (2019.01)]

20 Claims

1. A method, comprising:

defining a marker that, if detected in a data item, is indicative that the data item is a malicious data item;

for each of a plurality of data items of a data set:

determining a marker score for the marker; and

determining a data item marker score based at least in part on the marker score;

determining, for a first machine learning model trained to detect malicious data items, a first sub-plurality of data items of the plurality of data items;

determining, for the first sub-plurality of data items and based at least in part on the data item marker scores determined for each of the first sub-plurality of data items, a first model marker score;

determining, for a second machine learning model trained to detect malicious data items, a second sub-plurality of data items of the plurality of data items;

determining, for the second sub-plurality of data items and based at least in part on the data item marker scores determined for each of the second sub-plurality of data items, a second model marker score;

determining, based at least in part on the first model marker score and the second model marker score, that the second machine learning model performs better at detecting malicious data items than the first machine learning model; and

selecting the second machine learning model to process data items to detect malicious data items.
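
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and the claim does not fix any of the following choices, which are assumptions made only for illustration: the marker is a substring (`MARKER` is a hypothetical value), a marker score is 1.0 when the marker is present and 0.0 otherwise, each model's sub-plurality is the set of data items that model flags as malicious, a model marker score is the mean of the data item marker scores over that sub-plurality, and a higher model marker score means the model performs better. All function names are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# Hypothetical marker: a substring assumed to indicate a malicious data item.
MARKER = "eval(base64"

Model = Callable[[str], bool]  # returns True if the model flags the item


def marker_score(item: str, marker: str = MARKER) -> float:
    """Assumed scoring: 1.0 if the marker is detected in the item, else 0.0."""
    return 1.0 if marker in item else 0.0


def data_item_marker_score(item: str) -> float:
    """With a single marker, the item score is simply that marker's score."""
    return marker_score(item)


def model_marker_score(model: Model, items: Sequence[str],
                       item_scores: Sequence[float]) -> float:
    """Mean item score over the items the model flags (assumed aggregation)."""
    flagged: List[int] = [i for i, item in enumerate(items) if model(item)]
    if not flagged:
        return 0.0  # assumption: a model that flags nothing scores zero
    return mean(item_scores[i] for i in flagged)


def select_model(first: Model, second: Model,
                 items: Sequence[str]) -> Tuple[Model, float, float]:
    """Score both models on the same data set and return the higher scorer."""
    item_scores = [data_item_marker_score(item) for item in items]
    first_score = model_marker_score(first, items, item_scores)
    second_score = model_marker_score(second, items, item_scores)
    chosen = second if second_score > first_score else first
    return chosen, first_score, second_score
```

In this sketch, the marker acts as a stand-in for ground-truth labels: a model whose flagged items more often contain the marker is treated as the better detector, so two models can be compared on unlabeled data.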