| CPC G06F 16/217 (2019.01) [G06F 16/285 (2019.01)] | 20 Claims |

|
1. A computer-implemented method, comprising operations for:
training a model using an original dataset with data;
classifying the data of the original dataset into clusters;
generating a first visualization dashboard with visualizations, wherein each visualization represents a first data distribution of an associated cluster of the clusters;
assigning a first distribution value to the first data distribution of a particular cluster of the clusters;
fine-tuning the model using a fine-tune dataset;
generating a second visualization dashboard by updating each visualization, wherein each visualization represents a second data distribution of the associated cluster;
assigning a second distribution value to the second data distribution of the particular cluster;
determining that the particular cluster has focus drift that exceeds a threshold based on a difference between the second distribution value and the first distribution value exceeding the threshold, wherein the difference indicates that the second data distribution of the particular cluster is not distinct from one or more data distributions of other clusters based on the threshold;
identifying a subset of data from the data of the original dataset for the particular cluster;
combining the subset of the data from the original dataset with the fine-tune dataset to form a combined dataset;
correcting the focus drift by fine-tuning the model using the combined dataset; and
determining that the corrected focus drift for the particular cluster does not exceed the threshold.
|
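The drift-detection and correction operations of claim 1 can be illustrated with a minimal sketch. The claim does not specify how a distribution value is computed, so the separation score below (distance to the nearest other cluster centroid divided by the cluster's own mean radius) is an assumption chosen only for illustration. The function names `distribution_value`, `has_focus_drift`, and `build_combined_dataset`, and the `max_items` parameter, are likewise hypothetical and do not come from the claim.

```python
from math import dist
from statistics import mean


def centroid(points):
    """Return the coordinate-wise mean of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))


def distribution_value(clusters, name):
    """Hypothetical distribution value for one cluster.

    Assumption: the value is a separation score, i.e. the distance from the
    cluster's centroid to the nearest other centroid, divided by the
    cluster's own mean radius. A lower score means the cluster is less
    distinct from the other clusters.
    """
    own = centroid(clusters[name])
    # Guard against a zero radius when all points coincide.
    spread = mean(dist(p, own) for p in clusters[name]) or 1e-9
    nearest = min(
        dist(own, centroid(points))
        for other, points in clusters.items()
        if other != name
    )
    return nearest / spread


def has_focus_drift(first_value, second_value, threshold):
    """Drift is flagged when the change in distribution value exceeds the threshold."""
    return abs(second_value - first_value) > threshold


def build_combined_dataset(original, labels, cluster, fine_tune, max_items):
    """Combine a subset of the original data for one cluster with the fine-tune data."""
    subset = [x for x, label in zip(original, labels) if label == cluster][:max_items]
    return subset + list(fine_tune)


# Illustrative cluster contents before and after fine-tuning (made-up points).
before = {
    "a": [(0, 0), (0, 2), (2, 0), (2, 2)],
    "b": [(10, 10), (10, 12), (12, 10), (12, 12)],
}
after = {
    "a": [(4, 4), (4, 10), (10, 4), (10, 10)],  # cluster "a" has spread toward "b"
    "b": [(10, 10), (10, 12), (12, 10), (12, 12)],
}

first_value = distribution_value(before, "a")
second_value = distribution_value(after, "a")
drifted = has_focus_drift(first_value, second_value, threshold=2.0)
```

In this sketch, when `drifted` is true, the caller would pass the result of `build_combined_dataset` to whatever fine-tuning routine the model uses, then recompute the distribution value and call `has_focus_drift` again to confirm that the corrected drift no longer exceeds the threshold.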