| CPC G06F 16/35 (2019.01) [G06F 16/38 (2019.01); G06F 40/279 (2020.01)] | 23 Claims |

|
1. A system comprising:
one or more memory devices storing processor-executable instructions; and
one or more processors configured to execute instructions to cause the system to perform:
extracting, from a database, first information associated with a first instance of content and second information associated with a first person, wherein the first person is a creator of the first instance of the content and wherein the second information includes a classification of the first person;
cleaning the extracted first information based on contextual information associated with the first instance of the content by calculating relation distances using the contextual information between a first keyword and a second keyword distinct from the first keyword in the extracted first information;
classifying the cleaned first information into a first set of categories in the database;
determining a second set of categories based on the second information associated with the first person, wherein the second information associated with the first person includes the cleaned first information associated with the first instance of the content;
aggregating the second information associated with the first person using the second set of categories by generating embeddings of one or more keywords of the second information based on context similarity and semantic similarity, wherein the one or more keywords of the second information are grouped into categories, and wherein the context similarity is based on a similarity of the first person associated with the second information and a second person associated with the second information;
determining a third set of categories based on third information associated with a group of people including the first person, wherein the third information includes the second information associated with the first person; and
generating data for the third information associated with the group of people by determining frequency data associated with the third information, wherein the frequency data is determined based on the first set of categories, the second set of categories, and the third set of categories.
|
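For illustration only, the cleaning and frequency-determination steps recited in claim 1 can be sketched as follows. The claim does not specify how a relation distance is computed or how categories are assigned, so everything here is an assumption: `relation_distance` uses the smallest token-position gap between two keywords in the contextual text, `clean_keywords` keeps a keyword only if it lies within a hypothetical `max_distance` of another keyword, and `frequency_by_level` applies a single hypothetical `categorize` function at the content, person, and group levels, whereas the claim recites three separate sets of categories. The embedding-based aggregation step is omitted.

```python
from collections import Counter


def relation_distance(tokens, first_kw, second_kw):
    """Smallest token-position gap between two keywords (assumed metric).

    Returns None when either keyword is absent from the context tokens.
    """
    pos_a = [i for i, t in enumerate(tokens) if t == first_kw]
    pos_b = [i for i, t in enumerate(tokens) if t == second_kw]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def clean_keywords(keywords, context_tokens, max_distance=3):
    """Keep keywords lying within max_distance tokens of another keyword.

    max_distance is a hypothetical threshold, not taken from the claim.
    """
    kept = []
    for kw in keywords:
        for other in keywords:
            if other == kw:
                continue
            d = relation_distance(context_tokens, kw, other)
            if d is not None and d <= max_distance:
                kept.append(kw)
                break
    return kept


def frequency_by_level(group, categorize):
    """Count category frequencies per content instance, per person, and per group.

    group maps a person to a list of cleaned keyword lists, one list per
    instance of content created by that person.
    """
    content_counts = {}
    person_counts = {}
    group_counts = Counter()
    for person, items in group.items():
        person_counts[person] = Counter()
        for idx, kws in enumerate(items):
            counts = Counter(categorize(kw) for kw in kws)
            content_counts[(person, idx)] = counts
            person_counts[person].update(counts)
        group_counts.update(person_counts[person])
    return content_counts, person_counts, group_counts


# Hypothetical example data.
tokens = "the new camera lens ships with a free tripod sale".split()
cleaned = clean_keywords(["camera", "lens", "tripod"], tokens)

CATEGORY = {"camera": "photography", "lens": "photography",
            "tripod": "photography", "pasta": "food"}
group = {"alice": [["camera", "lens"], ["pasta"]], "bob": [["tripod"]]}
content_counts, person_counts, group_counts = frequency_by_level(
    group, CATEGORY.get)
```

In this sketch, "tripod" is dropped during cleaning because it sits more than three tokens from every other keyword, and the group-level counts are simply the sum of the person-level counts, which are in turn the sum of the content-level counts.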