| CPC H04L 63/1483 (2013.01) [G06F 40/284 (2020.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)] | 20 Claims |

|
1. One or more computer storage media having computer-executable instructions embodied thereon that, when executed by one or more processors, cause the one or more processors to perform a method comprising:
generating a set of Uniform Resource Locator (URL) tokens based on a URL;
generating a set of metadata tokens based on metadata associated with the URL;
generating a set of feature tokens based on the set of URL tokens, the set of metadata tokens, and a set of separator tokens by at least concatenating the set of URL tokens and the set of metadata tokens into the set of feature tokens including a first separator token from the set of separator tokens between a first metadata token of the set of metadata tokens and a second metadata token, wherein the first separator token indicates a type of metadata associated with the second metadata token;
providing the set of feature tokens as a single input vector to a transformer model;
obtaining an output of the transformer model including an embedding vector;
determining a decision statistic based on the embedding vector; and
as a result of the decision statistic indicating the URL is malicious, causing a remedial action to be performed, where the remedial action prevents a computing device from accessing the URL.
|
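
The token-assembly and decision steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The tokenizer regex, the `[SEP_<TYPE>]` separator naming, the metadata field names, the logistic decision statistic, and the 0.5 threshold are all assumptions chosen for illustration, and the transformer model is replaced by a hypothetical `embed` callable that maps a token list to an embedding vector.

```python
import math
import re
from typing import Callable, Dict, List, Sequence


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphanumeric runs and single punctuation marks.

    Assumed tokenizer for illustration; the claim does not specify one.
    """
    return re.findall(r"[a-z0-9]+|[^a-z0-9\s]", text.lower())


def build_feature_tokens(url: str, metadata: Dict[str, str]) -> List[str]:
    """Concatenate URL tokens and metadata tokens into one feature sequence.

    Each metadata field is preceded by a separator token naming its type,
    so the separator between two metadata fields indicates the type of the
    field that follows it. The "[SEP_<TYPE>]" format is hypothetical.
    """
    tokens = tokenize(url)
    for field, value in metadata.items():
        tokens.append(f"[SEP_{field.upper()}]")
        tokens.extend(tokenize(value))
    return tokens


def decision_statistic(embedding: Sequence[float],
                       weights: Sequence[float],
                       bias: float = 0.0) -> float:
    """Map an embedding vector to a score in (0, 1) with a logistic unit.

    Assumed scoring head; the claim only requires that a decision
    statistic be determined from the embedding vector.
    """
    z = sum(e * w for e, w in zip(embedding, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def should_block(tokens: List[str],
                 embed: Callable[[List[str]], Sequence[float]],
                 weights: Sequence[float],
                 threshold: float = 0.5) -> bool:
    """Return True when the score marks the URL as malicious.

    `embed` stands in for the transformer model; a True result is where a
    remedial action such as blocking access to the URL would be triggered.
    """
    return decision_statistic(embed(tokens), weights) >= threshold


if __name__ == "__main__":
    features = build_feature_tokens(
        "http://a.b", {"title": "Log in", "referrer": "mail"}
    )
    print(features)
```

In this sketch the whole token list is passed to `embed` as one sequence, mirroring the claim's single input vector, and `[SEP_REFERRER]` sits between the last title token and the first referrer token to mark the type of the metadata that follows.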