US 12,248,572 B2
Methods and apparatus for using machine learning on multiple file fragments to identify malware
Joshua Daniel Saxe, Wichita, KS (US); and Richard Harang, Alexandria, VA (US)
Assigned to Sophos Limited, Abingdon (GB)
Filed by Sophos Limited, Abingdon (GB)
Filed on Mar. 20, 2023, as Appl. No. 18/186,587.
Application 18/186,587 is a continuation of application No. 16/853,803, filed on Apr. 21, 2020, granted, now 11,609,991.
Application 16/853,803 is a continuation of application No. 15/727,035, filed on Oct. 6, 2017, granted, now 10,635,813, issued on Apr. 28, 2020.
Prior Publication US 2023/0229772 A1, Jul. 20, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. H04L 29/06 (2006.01); G06F 21/56 (2013.01); G06N 3/084 (2023.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06F 21/565 (2013.01) [G06F 21/562 (2013.01); G06F 21/563 (2013.01); G06N 3/084 (2013.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A method, comprising:
receiving a potentially malicious file;
splitting data of the potentially malicious file into a first set of fragments, each fragment from the first set of fragments having a same first size;
splitting the data of the potentially malicious file into a second set of fragments, each fragment from the second set of fragments having a same second size different than the first size;
providing the first set of fragments to a machine learning model to identify a first fragment, the first fragment being from the first set of fragments and including information potentially relevant to a determination of whether the potentially malicious file is malicious;
providing the second set of fragments to the machine learning model to identify a second fragment, the second fragment being from the second set of fragments and including information potentially relevant to a determination of whether the potentially malicious file is malicious; and
determining, based on a combination of the first fragment and the second fragment identified by the machine learning model, that the potentially malicious file is malicious.
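
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim does not specify any of the following choices, all of which are assumptions made for illustration: the function names (split_fixed, entropy_score, most_relevant, classify), zero-padding of the final fragment so that every fragment has the same size, a byte-entropy score standing in for the machine learning model, averaging of the two selected scores as the "combination", and the 0.5 threshold.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def split_fixed(data: bytes, size: int) -> List[bytes]:
    """Split data into fragments of exactly `size` bytes.

    Assumption: the last fragment is zero-padded so all fragments share one size.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    fragments = [data[i:i + size] for i in range(0, len(data), size)]
    if fragments and len(fragments[-1]) < size:
        fragments[-1] = fragments[-1].ljust(size, b"\x00")
    return fragments


def entropy_score(fragment: bytes) -> float:
    """Hypothetical stand-in for the model: Shannon byte entropy scaled to [0, 1]."""
    if not fragment:
        return 0.0
    n = len(fragment)
    counts = Counter(fragment)
    entropy_bits = -sum(c / n * math.log2(c / n) for c in counts.values())
    return entropy_bits / 8.0  # 8 bits is the maximum entropy of a byte


def most_relevant(fragments: List[bytes],
                  score: Callable[[bytes], float]) -> Tuple[bytes, float]:
    """Return the highest-scoring fragment of a set together with its score."""
    scored = [(score(f), f) for f in fragments]
    best_score, best_fragment = max(scored, key=lambda pair: pair[0])
    return best_fragment, best_score


def classify(data: bytes, first_size: int, second_size: int,
             score: Callable[[bytes], float] = entropy_score,
             threshold: float = 0.5) -> bool:
    """Split the same data at two different sizes, pick one fragment from
    each set, and decide from the combination of the two."""
    if not data:
        raise ValueError("data must be non-empty")
    if first_size == second_size:
        raise ValueError("the two fragment sizes must differ")
    _, first_score = most_relevant(split_fixed(data, first_size), score)
    _, second_score = most_relevant(split_fixed(data, second_size), score)
    # Assumed combination rule: mean of the two selected fragments' scores.
    return (first_score + second_score) / 2 >= threshold
```

In this sketch, classify(file_bytes, 256, 1024) would split the same bytes once into 256-byte fragments and once into 1024-byte fragments. A trained model would replace entropy_score through the score parameter. The entropy heuristic is only a placeholder and is not a reliable malware indicator.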