| CPC G06Q 40/02 (2013.01) [G06F 18/2148 (2023.01); G06F 18/23213 (2023.01); G06N 20/00 (2019.01)] | 8 Claims |

|
1. A method performed by a data management system, comprising:
analyzing a plurality of data sets with an analysis model comprising a first submodel and a second submodel that use different types of machine learning processes, each data set including data downloaded by the data management system from an institution computing environment via a network from an account of one or more users of the data management system via a first data acquisition process and a new data acquisition process that is under test,
wherein data downloaded via the new data acquisition process is the same as data downloaded via the first data acquisition process if the new data acquisition process correctly obtains details of the corresponding account,
wherein data downloaded via the first data acquisition process has different structures than data downloaded via the new data acquisition process due to differences in the first data acquisition process and the new data acquisition process,
wherein account features and transaction features are extracted from the data downloaded and are included in the plurality of data sets;
wherein the first submodel of the analysis model is trained to determine whether the account features in the data having different data structures contain the same data, and the second submodel of the analysis model is trained to determine whether the transaction features in the data having different data structures contain the same data;
determining, with the analysis model for each data set, whether the new data acquisition process correctly obtains details for the corresponding account based on data downloaded via the new data acquisition process containing the same data as data downloaded via the first data acquisition process;
generating groups of data sets by performing, with a clustering model, a clustering algorithm on the data sets for which the new data acquisition process is not correctly obtaining details for the corresponding account; and
troubleshooting the new data acquisition process by selecting one or more data sets from each group and investigating the new data acquisition process for the one or more data sets selected from each group.
|
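The flow recited in claim 1 (compare, flag failures, cluster the failures, sample each cluster) can be illustrated with a minimal sketch. It is not the claimed implementation: the claim recites two trained machine learning submodels, whereas this sketch substitutes simple rule-based comparisons, and it assumes both downloads have already been normalized to a common shape. The names `mismatch_vector`, `kmeans`, `triage`, and `pick_samples`, the record fields (`id`, `old`, `new`, `balance`, `transactions`), and the choice of k-means with a fixed `k` are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def mismatch_vector(old, new):
    """Hypothetical vector describing how `new` differs from `old`.

    Both records are assumed to be normalized already to
    {"balance": float, "transactions": [(date, amount), ...]}.
    Rule-based stand-in for the two trained submodels in the claim.
    """
    old_tx, new_tx = set(old["transactions"]), set(new["transactions"])
    return (
        float(old["balance"] != new["balance"]),  # account-feature mismatch
        float(len(old_tx - new_tx)),              # transactions missing from new
        float(len(new_tx - old_tx)),              # transactions only in new
    )


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k distinct points."""
    centroids = list(dict.fromkeys(points))[:k]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def triage(data_sets, k=2):
    """Group the data sets where the new process disagrees with the first."""
    failing = []
    for ds in data_sets:
        vec = mismatch_vector(ds["old"], ds["new"])
        if any(vec):  # any non-zero component means the downloads differ
            failing.append((ds["id"], vec))
    if not failing:
        return {}
    labels = kmeans([vec for _, vec in failing], k)
    groups = defaultdict(list)
    for (ds_id, _), label in zip(failing, labels):
        groups[label].append(ds_id)
    return dict(groups)


def pick_samples(groups):
    """Select one data set per group to investigate by hand."""
    return {label: ids[0] for label, ids in groups.items()}
```

Clustering the failures first means that one investigated sample can stand in for every data set that failed in a similar way, which is the stated purpose of the final troubleshooting step.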