| CPC G06N 20/20 (2019.01) [G06F 17/18 (2013.01); G06F 18/217 (2023.01); G06F 18/2415 (2023.01); G06N 5/01 (2023.01); G06N 20/00 (2019.01); G06V 10/751 (2022.01)] | 17 Claims |

|
1. A system comprising:
a data repository storing data samples having values of variables for input to a machine-learning model for risk assessment for an entity,
an external-facing subsystem configured for preventing a host server system from accessing the data repository via a data network, and
an evaluation system configured for:
accessing (a) an estimated dataset having a set of estimated values of an attribute that is a continuous variable, the estimated dataset generated by applying the machine-learning model to an input dataset of the data samples and (b) a validation dataset having a set of validation values of the attribute, the set of validation values respectively being known values corresponding to the set of estimated values generated by the machine-learning model,
generating, from a comparison of the estimated dataset and the validation dataset to an outcome of interest, a discretized evaluation dataset with data values in multiple categories, the discretized evaluation dataset comprising a set of categories in a classification matrix and a number of instances in each category, the set of categories including a true positive category, a true negative category, a false positive category, and a false negative category,
computing, for the machine-learning model, an evaluation metric based on a comparison of data values from different categories of the discretized evaluation dataset, the evaluation metric indicating an accuracy of the machine-learning model, and
providing the host server system with access to (a) the evaluation metric or (b) a modeling output generated with the machine-learning model which indicates a risk level associated with the entity, causing the host server system to allow or prevent the entity's access to a restricted function of a computing environment, based on the modeling output,
wherein generating the discretized evaluation dataset comprises:
identifying a first category for the discretized evaluation dataset indicating a match between estimated attribute values and validation attribute values with respect to the outcome of interest;
identifying a second category for the discretized evaluation dataset indicating a mismatch between estimated attribute values and validation attribute values with respect to the outcome of interest;
determining, from the comparison of the estimated dataset and the validation dataset to the outcome of interest, a number of matches in the first category and a number of mismatches in the second category; and
outputting the discretized evaluation dataset having the first category with the number of matches and the second category with the number of mismatches.
|
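The discretization and metric steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the function names `discretize`, `match_counts`, and `accuracy`, the threshold-style `outcome` predicate, and the choice of accuracy as the evaluation metric are all assumptions made for illustration. The claim itself requires only a metric based on a comparison of values from different categories.

```python
from collections import Counter
from typing import Callable, Sequence


def discretize(estimated: Sequence[float],
               validation: Sequence[float],
               outcome: Callable[[float], bool]) -> Counter:
    """Compare each estimated/known pair to the outcome of interest and
    count instances in the four classification-matrix categories."""
    if len(estimated) != len(validation):
        raise ValueError("estimated and validation values must be paired")
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    for est, known in zip(estimated, validation):
        predicted, actual = outcome(est), outcome(known)
        if predicted and actual:
            counts["TP"] += 1      # both meet the outcome
        elif not predicted and not actual:
            counts["TN"] += 1      # neither meets the outcome
        elif predicted:
            counts["FP"] += 1      # estimate meets it, known value does not
        else:
            counts["FN"] += 1      # known value meets it, estimate does not
    return counts


def match_counts(counts: Counter) -> tuple[int, int]:
    """Collapse the four categories into (matches, mismatches)."""
    return counts["TP"] + counts["TN"], counts["FP"] + counts["FN"]


def accuracy(counts: Counter) -> float:
    """One possible evaluation metric (assumed here): share of matches."""
    matches, mismatches = match_counts(counts)
    total = matches + mismatches
    return matches / total if total else 0.0


# Hypothetical data: a continuous attribute and an assumed outcome
# of interest "value is at least 0.5".
estimated = [0.9, 0.2, 0.7, 0.4]
validation = [0.8, 0.1, 0.3, 0.6]
counts = discretize(estimated, validation, lambda v: v >= 0.5)
print(dict(counts), match_counts(counts), accuracy(counts))
```

Applying the outcome predicate to both the estimated and the known value turns a continuous-variable comparison into a categorical one, which is what lets classification-style metrics be computed for a regression-style model.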