| CPC G06T 11/206 (2013.01) [G06N 20/00 (2019.01)] | 16 Claims |

|
1. A computer-implemented method comprising:
presenting, by one or more processors, a first visualization of a training dataset in a first plot;
responsive to receiving a selection of a data group of the training dataset to analyze, identifying, by the one or more processors, three or fewer key model features of the data group of the training dataset;
ascertaining, by the one or more processors, a representative record of each key model feature of the three or fewer key model features using a Local Interpretable Model-Agnostic Explanation technique;
presenting, by the one or more processors, a second visualization of the three or fewer key model features and the representative record of each key model feature in a second plot;
correcting or completing, by the one or more processors, the training dataset, wherein the training dataset is either incorrect or incomplete;
prior to presenting the first visualization of the training dataset in the first plot, gathering, by one or more processors, the training dataset from one or more sources;
determining, by one or more processors, a degree of importance of the one or more key model features;
ranking, by one or more processors, the one or more key model features according to the degree of importance; and
selecting, by one or more processors, the three or fewer key model features based on a set of criteria, wherein the set of criteria is selected from a group consisting of: a degree of accuracy of each key model feature of the training dataset and a pre-set configuration, wherein the selecting further comprises:
selecting, by one or more processors, two key model features of the three or fewer key model features selected; and
condensing, by one or more processors, a key model feature of the three or fewer key model features not selected into a linear combination using a Principal Component Analysis to produce three-dimensional condensed data.
|
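
The ranking, selecting, and condensing steps recited in claim 1 can be illustrated with a minimal sketch. It reflects one possible reading of the claim, not the patented implementation: features are ranked by a given importance score, the two highest-ranked features are kept as two axes, and the remaining features are condensed into a single linear combination (the first principal component) that serves as the third axis. The function names, the feature names (`income`, `tenure`, `visits`, `spend`), the importance values, and the sample records are all hypothetical, and the principal component is computed by power iteration rather than by any particular library.

```python
import math
import random


def rank_features(importance):
    """Return feature names ordered from most to least important."""
    return sorted(importance, key=importance.get, reverse=True)


def first_principal_component(rows, iterations=200, seed=0):
    """Return (direction, scores) for the leading principal axis of rows.

    Uses power iteration on the sample covariance matrix; needs >= 2 rows.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(d)]  # generic positive start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    # Project each centered row onto the principal direction.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


def condense_to_three_dimensions(records, importance):
    """Keep the two top-ranked features; condense the rest into one axis."""
    ranked = rank_features(importance)
    kept, rest = ranked[:2], ranked[2:]
    rest_rows = [[rec[name] for name in rest] for rec in records]
    _, third_axis = first_principal_component(rest_rows)
    return [(rec[kept[0]], rec[kept[1]], z)
            for rec, z in zip(records, third_axis)]


# Hypothetical importance scores and records, for illustration only.
importance = {"income": 0.5, "tenure": 0.3, "visits": 0.15, "spend": 0.05}
records = [
    {"income": 40.0, "tenure": 1.0, "visits": 1.0, "spend": 2.0},
    {"income": 55.0, "tenure": 3.0, "visits": 2.0, "spend": 4.0},
    {"income": 70.0, "tenure": 4.0, "visits": 3.0, "spend": 6.0},
    {"income": 90.0, "tenure": 8.0, "visits": 4.0, "spend": 8.0},
]

points = condense_to_three_dimensions(records, importance)
for point in points:
    print(point)
```

Each resulting tuple can be plotted as one point in a three-dimensional second plot. The sign of the third coordinate depends on the orientation of the principal direction, which is arbitrary in Principal Component Analysis.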