| CPC H04L 63/1416 (2013.01) [H04L 41/16 (2013.01); H04L 63/1433 (2013.01)] | 19 Claims |

|
1. A system to identify cyber threat intelligence from a group of information comprising:
a processing subsystem hosted on a server and configured to execute on a network to control bidirectional communications among a plurality of modules comprising:
a data sourcing module operatively coupled to an integrated database, wherein the data sourcing module is configured to fetch the group of information from one or more web sources;
a data processing module operatively coupled to the data sourcing module, wherein the data processing module is configured to:
segregate the group of information fetched by the data sourcing module into one or more corresponding datatypes;
extract textual information from the group of information segregated into the one or more corresponding datatypes, wherein the textual information comprises at least one of a structured text and an unstructured text;
a machine learning module operatively coupled to the data processing module, wherein the machine learning module comprises:
an entity analysis module operatively coupled to the data processing module, wherein the entity analysis module is configured to:
fragment the textual information extracted by the data processing module to obtain one or more entities comprising at least one of a noun, noun phrase, verb, verb phrase, adjective, and adjective phrase;
assign a label to each of the one or more entities obtained upon comparing the one or more entities with one or more corresponding datasets stored in the integrated database;
analyze the label assigned to each of the one or more entities to generate a first threat score, wherein the first threat score is indicative of a status of the textual information comprising at least one of a threat and a non-threat,
wherein the entity analysis module is trained by one or more machine learning techniques to classify the one or more entities into one or more categories comprising at least one of the threat and the non-threat, wherein the one or more machine learning techniques comprise:
provide one or more textual information comprising a corresponding label to the entity analysis module, wherein the one or more textual information comprises one or more corresponding entities;
extract the one or more corresponding entities from the one or more corresponding textual information provided;
assign the corresponding label of the one or more corresponding textual information to each of the one or more corresponding entities extracted;
calculate a threshold value corresponding to the one or more entities extracted based on the corresponding label assigned; and
classify the one or more entities into one or more categories based on the corresponding threshold value calculated;
a semantic analysis module operatively coupled to the entity analysis module, wherein the semantic analysis module is configured to:
summarize the one or more entities without altering a collective meaning of the one or more entities to obtain a summarized text;
evaluate one or more sentiments pertaining to the summarized text by performing one or more sentiment analysis techniques;
analyze the one or more sentiments evaluated to generate a second threat score, wherein the second threat score is indicative of a status of the one or more sentiments evaluated, wherein the status comprises at least one of the threat and the non-threat; and
a classifier module operatively coupled to the semantic analysis module and the entity analysis module, wherein the classifier module is configured to classify the textual information extracted by the data processing module into one or more categories comprising at least one of the threat and the non-threat, thereby identifying the cyber threat intelligence from the group of information.
|
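
The training steps recited for the entity analysis module (provide labelled textual information, extract entities, assign each text's label to its entities, calculate a threshold value, classify) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the claimed implementation: the claim does not say how entities are extracted or how the threshold value is calculated. Here `extract_entities` is a hypothetical whitespace tokenizer standing in for noun, verb, and adjective phrase extraction, the threshold value is assumed to be the fraction of an entity's occurrences that came from threat-labelled texts, and the `cutoff` parameter and all function names are invented for the example.

```python
from collections import Counter, defaultdict


def extract_entities(text):
    # Hypothetical stand-in for entity extraction: lowercase word tokens
    # with surrounding punctuation stripped.
    tokens = (word.strip(".,;:!?").lower() for word in text.split())
    return [token for token in tokens if token]


def train_entity_classifier(labelled_texts, cutoff=0.5):
    # labelled_texts: iterable of (text, label) pairs, where label is
    # "threat" or "non-threat".
    counts = defaultdict(Counter)
    for text, label in labelled_texts:
        # Assign the label of the text to each entity extracted from it.
        for entity in set(extract_entities(text)):
            counts[entity][label] += 1

    categories = {}
    for entity, label_counts in counts.items():
        # Assumed threshold value: share of threat-labelled occurrences.
        threat_ratio = label_counts["threat"] / sum(label_counts.values())
        categories[entity] = "threat" if threat_ratio > cutoff else "non-threat"
    return categories


def first_threat_score(text, categories):
    # Assumed scoring rule: fraction of known entities classified as threat.
    known = [e for e in extract_entities(text) if e in categories]
    if not known:
        return 0.0
    return sum(categories[e] == "threat" for e in known) / len(known)


if __name__ == "__main__":
    training = [
        ("Ransomware attack leaked credentials", "threat"),
        ("Quarterly meeting agenda", "non-threat"),
        ("Phishing attack reported", "threat"),
        ("Team meeting attack plan", "non-threat"),
    ]
    categories = train_entity_classifier(training)
    print(first_threat_score("Phishing attack on meeting", categories))
```

In this sketch, entities never seen during training are ignored when scoring. A fuller system would also need the semantic analysis module's second threat score and a classifier module that combines both scores, neither of which is modeled here.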