US 11,757,901 B2
Malicious homoglyphic domain name detection and associated cyber security applications
Vincent Mutolo, Portsmouth, NH (US); Alexander Chinchilli, Medford, MA (US); Sean Moore, Hollis, NH (US); Matthew Sparrow, Virginia Beach, VA (US); and Connor Tess, Merrimack, NH (US)
Assigned to Centripetal Networks, LLC, Portsmouth, NH (US)
Filed by Centripetal Networks, LLC, Portsmouth, NH (US)
Filed on Sep. 16, 2022, as Appl. No. 17/946,900.
Claims priority of provisional application 63/345,719, filed on May 25, 2022.
Claims priority of provisional application 63/245,074, filed on Sep. 16, 2021.
Prior Publication US 2023/0083949 A1, Mar. 16, 2023
Int. Cl. H04L 9/00 (2022.01); H04L 9/40 (2022.01); H04L 61/4511 (2022.01)
CPC H04L 63/14 (2013.01) [H04L 61/4511 (2022.05); H04L 63/1416 (2013.01); H04L 63/1433 (2013.01); H04L 63/1483 (2013.01)]
30 Claims
 
1. A computing device for homoglyphic domain name detection, wherein the computing device comprises:
one or more processors; and
memory storing instructions that, when executed by the one or more processors, cause the computing device to:
receive an input domain name for homoglyphic domain name detection;
generate a normalized character string corresponding to the input domain name by applying one or more normalization operations to the input domain name, wherein the one or more normalization operations are configured to reduce homoglyphic characteristics in the input domain name;
generate a plurality of segmentations of the normalized character string, wherein generating each segmentation, of the plurality of segmentations, comprises segmenting the normalized character string into a respective plurality of segments, and wherein each segmentation comprises a different plurality of segments;
select a first segmentation, of the plurality of segmentations, based on cost values corresponding to each respective segmentation determined using a cost function, wherein the cost function is configured to assign a cost value to a given segmentation based on at least one list of known domain names;
compare the selected first segmentation with the at least one list of known domain names to determine whether one or more segments of the selected first segmentation match a base of a known domain name in the at least one list of known domain names;
determine that the input domain name is a homoglyphic domain name based on a determination that the one or more segments of the selected first segmentation match a base of a known domain name in the at least one list of known domain names; and
output, based on the determination that the input domain name is a homoglyphic domain name, an indication that the input domain name has been detected as a homoglyphic domain name, wherein the indication comprises at least one of:
the matched base of the known domain name; or
the one or more segments that match the base of the known domain name.
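
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name and value in it is an assumption made for illustration: the `KNOWN_DOMAINS` list, the small `CONFUSABLES` map, the helpers `normalize`, `segmentations`, `cost`, and `detect`, the cap of three cut points per segmentation, and the particular cost function (1 for a segment equal to a known base, otherwise 1 plus the segment length). The claim only requires that the cost function depend on a list of known domain names; the guard that skips exact known domains is likewise an added assumption.

```python
import unicodedata
from itertools import combinations

# Hypothetical list of known domain names (illustrative values only).
KNOWN_DOMAINS = ["paypal.com", "google.com", "microsoft.com"]
# "Base" is simplified here to the first label of each known domain.
KNOWN_BASES = {d.split(".")[0] for d in KNOWN_DOMAINS}

# Tiny, hypothetical look-alike map; a real system would use a far larger table.
CONFUSABLES = {
    "0": "o", "1": "l", "3": "e", "5": "s",
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic e
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er, looks like Latin p
}


def normalize(domain):
    """Reduce homoglyphic characteristics: drop the last label, decompose,
    strip combining marks, map look-alikes, keep only alphanumerics."""
    labels = domain.lower().split(".")
    text = "".join(labels[:-1]) if len(labels) > 1 else labels[0]
    out = []
    for ch in unicodedata.normalize("NFKD", text):
        if unicodedata.combining(ch):
            continue  # accent or other combining mark
        ch = CONFUSABLES.get(ch, ch)
        if ch.isalnum():
            out.append(ch)  # hyphens and other separators are dropped
    return "".join(out)


def segmentations(text, max_cuts=3):
    """Yield every split of text that uses at most max_cuts cut points."""
    n = len(text)
    for k in range(min(max_cuts, n - 1) + 1):
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [text[a:b] for a, b in zip(bounds, bounds[1:])]


def cost(segments):
    """Assumed cost: known bases are cheap, unknown segments pay per character."""
    return sum(1 if s in KNOWN_BASES else 1 + len(s) for s in segments)


def detect(domain):
    """Return an indication dict if the domain looks homoglyphic, else None."""
    if domain.lower() in KNOWN_DOMAINS:
        return None  # assumption: an exact known domain is not reported
    normalized = normalize(domain)
    if not normalized:
        return None
    best = min(segmentations(normalized), key=cost)
    matched = [s for s in best if s in KNOWN_BASES]
    if not matched:
        return None
    return {"input": domain, "matched_bases": matched, "segmentation": best}


if __name__ == "__main__":
    # Cyrillic a and the digit 1 stand in for Latin a and l.
    print(detect("secure-p\u0430ypa1-login.com"))
```

Capping the number of cut points keeps the enumeration small for this illustration; a dynamic-programming search over all split points would be the usual alternative for longer names. The sketch also reports any name whose selected segmentation contains a known base, so it does not distinguish homoglyph substitution from a look-alike name that merely embeds the brand.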