| CPC H04L 63/1416 (2013.01) | 25 Claims |

|
1. A method comprising:
providing, by one or more processors, first text data to a trained large language model (LLM) to identify first data associated with a first candidate cybersecurity event experienced by an entity;
comparing, by the one or more processors, an identifier of the entity extracted from the first text data to domain information to identify a verified identifier of the entity, wherein the domain information indicates domains and names associated with a plurality of entities, and wherein the verified identifier includes a verified name of the entity, a verified domain of the entity, or a combination thereof;
determining, by the one or more processors, that the first candidate cybersecurity event represents a new cybersecurity event for the entity based on the first data and previous data corresponding to a previous cybersecurity event associated with the verified identifier, wherein determining that the first candidate cybersecurity event represents the new cybersecurity event comprises:
determining, by the one or more processors, whether a first similarity measure between the first data and the previous data is less than a predetermined similarity threshold; and
determining, by the one or more processors, that the first candidate cybersecurity event represents the new cybersecurity event for the entity in response to determining that the first similarity measure is less than the predetermined similarity threshold; and
updating, by the one or more processors, a cybersecurity risk score associated with the verified identifier of the entity based on the first data in response to determining that the first candidate cybersecurity event represents the new cybersecurity event.
|
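The identifier verification, novelty determination, and score update recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The claim does not specify a similarity measure, a threshold value, or a scoring rule, so the Jaccard similarity over term sets, the 0.5 threshold, the fixed score increment, and the names `EntityRecord`, `verify_identifier`, and `process_candidate` are all assumptions made for illustration. The LLM extraction step is not modeled; the sketch takes the extracted identifier and event terms as given inputs.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    """Hypothetical domain-information entry: one verified name and domain."""
    name: str
    domain: str
    risk_score: float = 0.0
    previous_events: list[set[str]] = field(default_factory=list)


def verify_identifier(identifier, domain_info):
    """Return the record whose name or domain matches the extracted identifier.

    Assumption: a case-insensitive exact match stands in for the comparison.
    """
    key = identifier.strip().lower()
    for record in domain_info:
        if key in (record.name.lower(), record.domain.lower()):
            return record
    return None


def jaccard(a, b):
    """Jaccard similarity of two term sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def process_candidate(identifier, event_terms, domain_info,
                      threshold=0.5, increment=10.0):
    """Return True and update the risk score if the candidate event is new."""
    record = verify_identifier(identifier, domain_info)
    if record is None:
        return False  # no verified identifier, so nothing is updated
    # Assumption: the candidate is compared against every stored previous
    # event and is new only if each similarity is below the threshold.
    is_new = all(jaccard(event_terms, prev) < threshold
                 for prev in record.previous_events)
    if is_new:
        record.previous_events.append(event_terms)
        # Assumption: a higher score means higher risk, raised by a fixed step.
        record.risk_score += increment
    return is_new


domain_info = [EntityRecord("Acme Corp", "acme.example")]
first = process_candidate(
    "ACME.example", {"ransomware", "acme", "march", "breach"}, domain_info)
repeat = process_candidate(
    "Acme Corp", {"ransomware", "acme", "march", "leak"}, domain_info)
other = process_candidate(
    "acme.example", {"phishing", "credential", "theft"}, domain_info)
```

In this usage, the second report shares three of five distinct terms with the first (similarity 0.6, not below the assumed threshold), so it is treated as a duplicate and the score is left unchanged. The third report shares no terms with either stored event and is treated as a new event.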