CPC G06F 16/358 (2019.01) [G06F 40/30 (2020.01); G06F 40/40 (2020.01)] | 20 Claims |
1. A computer-implemented method for controlling and visualizing topic modeling results, the computer-implemented method comprising:
inputting, by a processor, a dataset into a hierarchical topic modeling algorithm configured for hierarchical clustering analysis and natural language processing (NLP) of the dataset;
generating, by the processor, a set of clusters based on a first set of parameters inputted into the hierarchical topic modeling algorithm, wherein each cluster represents a topic identified from the dataset;
outputting, by the processor, an interactive two-dimensional (2D) spatial distribution of the set of clusters to a user interface, wherein the interactive 2D spatial distribution is obtained through a multidimensional scaling of semantic embeddings, and nodes of the interactive 2D spatial distribution each represent a cluster of the set of clusters and distance between the nodes depicts a level of similarity between topics represented by the nodes;
selecting, by the processor, a first node of the interactive 2D spatial distribution being displayed by the user interface; and
in response to selecting the first node of the interactive 2D spatial distribution, visually generating, by the processor, an individual topic view of the first node based on refining the hierarchical topic modeling via an iterative interaction feedback loop, wherein the individual topic view comprises a semantic summary explaining topic definitions for the first node and structural attributes explaining how the topic of the first node differs from remaining nodes of the interactive 2D spatial distribution.
|
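The clustering and layout steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The toy `embeddings`, the `n_clusters` parameter, the average-linkage merge rule, and the use of classical (Torgerson) multidimensional scaling computed by power iteration are all assumptions chosen for illustration; the claim does not name a specific linkage criterion or scaling variant.

```python
import math

# Hypothetical toy "semantic embeddings", one vector per document (assumption).
embeddings = [
    [0.0, 0.0, 0.0], [0.1, 0.0, 0.0],   # documents about one topic
    [4.0, 0.0, 0.0], [4.1, 0.0, 0.0],   # documents about a second topic
    [0.0, 1.0, 0.0], [0.0, 1.1, 0.0],   # documents about a third topic
]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of indices."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(euclidean(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

def centroid(vectors, members):
    dim = len(vectors[0])
    return [sum(vectors[m][k] for m in members) / len(members)
            for k in range(dim)]

def top_eigenpair(matrix, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    n = len(matrix)
    v = [1.0 / (i + 1) for i in range(n)]  # deterministic, non-uniform start
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return 0.0, v
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n))
              for i in range(n))
    return lam, v

def classical_mds_2d(dist):
    """Classical MDS: double-center squared distances, keep top two axes."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        lam, v = top_eigenpair(b)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = scale * v[i]
        # Deflate so the next pass finds the next-largest eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# "First set of parameters": here only the number of clusters (assumption).
clusters = agglomerate(embeddings, n_clusters=3)
centroids = [centroid(embeddings, c) for c in clusters]
dist = [[euclidean(a, b) for b in centroids] for a in centroids]
coords = classical_mds_2d(dist)  # one 2D node per cluster (topic)

for members, (x, y) in zip(clusters, coords):
    print(f"topic with documents {members}: node at ({x:.3f}, {y:.3f})")
```

In this sketch each node is a cluster centroid, and the distance between two nodes approximates the dissimilarity of the corresponding topics. With three clusters the 2D layout reproduces the centroid distances exactly, since three points always fit in a plane; with more clusters the layout is only an approximation. Re-running `agglomerate` with a different `n_clusters` after a user selects a node would be one simple way to model a single pass of the feedback loop.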