| CPC G06F 11/3476 (2013.01) [G06F 11/3428 (2013.01); G06F 18/24323 (2023.01); G06N 20/20 (2019.01); G06N 3/0442 (2023.01); G06N 20/00 (2019.01)] | 20 Claims |

|
1. A method, comprising:
accessing input data comprising data elements from logs that identify user problems experienced with computing system components, the data elements each being associated with a respective original class label that identifies a class of computing system components to which the data element relates, the respective original class labels forming a group of class labels, and a first one of the original class labels being overrepresented in the group;
reducing the overrepresentation of the first original class label in the group by creating an arbitrary aggregation of some of the class labels that includes the first original class label;
building a hierarchical classification modeling structure configured to classify the input data using the aggregation, and also using one of the original class labels;
creating, based on a configuration of the hierarchical classification modeling structure, prepared data in which one or more of the original class labels is replaced by the aggregation;
training, using the prepared data, a hierarchical model that is included in the hierarchical classification modeling structure;
training a benchmark model using the original class labels;
collecting classification performance metrics of the benchmark model and of the hierarchical model;
generating a prediction, using the hierarchical model, to obtain a first predicted label;
generating a prediction, using the benchmark model, to obtain a second predicted label; and
comparing, based on the first predicted label and the second predicted label, the classification performance metrics of the benchmark model with the classification performance metrics of the hierarchical model.
|
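
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The log snippets, the class names (`storage`, `network`, `cooling`), the aggregate label `AGG`, the toy `TokenCountClassifier`, and the choice of accuracy as the performance metric are all hypothetical and chosen only for illustration. The sketch aggregates the overrepresented label with one other label, trains a two-level hierarchical model on the prepared labels, trains a flat benchmark model on the original labels, and compares the two.

```python
from collections import Counter, defaultdict

def aggregate_labels(labels, members, name="AGG"):
    """Prepared data: replace every label in `members` with the aggregate label."""
    return [name if y in members else y for y in labels]

class TokenCountClassifier:
    """Toy stand-in for a real classifier: scores a class by the share of its
    training tokens that also occur in the input text."""
    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        return self

    def predict_one(self, text):
        tokens = text.lower().split()
        def score(label):
            total = sum(self.counts[label].values())
            return sum(self.counts[label][t] for t in tokens) / total
        # sorted() makes tie-breaking deterministic
        return max(sorted(self.counts), key=score)

class HierarchicalModel:
    """Level 1 separates the aggregate from the remaining original labels;
    level 2 resolves the aggregate back into its member labels."""
    def __init__(self, members, name="AGG"):
        self.members, self.name = set(members), name

    def fit(self, texts, labels):
        prepared = aggregate_labels(labels, self.members, self.name)
        self.top = TokenCountClassifier().fit(texts, prepared)
        inner = [(t, y) for t, y in zip(texts, labels) if y in self.members]
        self.inner = TokenCountClassifier().fit(*zip(*inner))
        return self

    def predict_one(self, text):
        label = self.top.predict_one(text)
        return self.inner.predict_one(text) if label == self.name else label

def accuracy(model, texts, labels):
    hits = sum(model.predict_one(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)

# Hypothetical log excerpts; "storage" is the overrepresented class.
train = [
    ("disk array degraded after firmware update", "storage"),
    ("volume snapshot failed disk full", "storage"),
    ("disk latency high on storage pool", "storage"),
    ("raid rebuild stuck disk offline", "storage"),
    ("storage controller disk timeout", "storage"),
    ("switch port flapping link down", "network"),
    ("packet loss on uplink link", "network"),
    ("fan failure chassis overheating", "cooling"),
    ("temperature alarm fan speed low", "cooling"),
]
held_out = [
    ("disk offline in pool", "storage"),
    ("link down on switch", "network"),
    ("fan alarm chassis", "cooling"),
]
train_texts, train_labels = map(list, zip(*train))
test_texts, test_labels = map(list, zip(*held_out))

# Arbitrary aggregation that includes the overrepresented label.
members = {"storage", "network"}
prepared_labels = aggregate_labels(train_labels, members)

hierarchical = HierarchicalModel(members).fit(train_texts, train_labels)
benchmark = TokenCountClassifier().fit(train_texts, train_labels)

# First and second predicted labels for one data element.
first_predicted = hierarchical.predict_one(test_texts[0])
second_predicted = benchmark.predict_one(test_texts[0])

# Collected metrics, ready for comparison.
metrics = {
    "hierarchical": accuracy(hierarchical, test_texts, test_labels),
    "benchmark": accuracy(benchmark, test_texts, test_labels),
}
print(Counter(train_labels), Counter(prepared_labels))
print(first_predicted, second_predicted, metrics)
```

In practice the toy classifier would be replaced by whatever model type each level of the hierarchy calls for, and accuracy would likely be supplemented by per-class metrics such as recall, since overall accuracy can hide poor performance on underrepresented classes.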