US 12,436,967 B2
Visualizing feature variation effects on computer model prediction
Kin Kwan Leung, Toronto (CA); Barum Rho, Toronto (CA); Yaqiao Luo, Toronto (CA); Valentin Tsatskin, Toronto (CA); Derek Cheung, Toronto (CA); and Kyle William Hall, Baltimore (CA)
Assigned to The Toronto-Dominion Bank, Toronto (CA)
Filed by THE TORONTO-DOMINION BANK, Toronto (CA)
Filed on May 12, 2022, as Appl. No. 17/743,173.
Claims priority of provisional application 63/213,684, filed on Jun. 22, 2021.
Prior Publication US 2022/0405299 A1, Dec. 22, 2022
Int. Cl. G06F 16/26 (2019.01); G06F 16/28 (2019.01)

CPC G06F 16/26 (2019.01) [G06F 16/283 (2019.01); G06F 16/285 (2019.01)]

20 Claims

1. A system for visualizing feature variation effects on computer model prediction, comprising:

a processor; and

a computer-readable medium having instructions executable by the processor for:

identifying a trained computer model configured to generate an output value based on an input having a plurality of features;

identifying a data set having a plurality of data instances, each data instance having a feature vector corresponding to the plurality of features;

for each data instance in the plurality of data instances, generating an associated instance-feature variation plot describing model outputs of the trained computer model for a range of values for a first feature in the feature vector for the data instance;

clustering the plurality of data instances to a plurality of clusters based on the associated instance-feature variation plots, each cluster describing data instances having similar model outputs with respect to the range of values for the first feature;

labeling each of the plurality of data instances with the respective cluster associated with the instance;

training an interpretation model to output predicted membership of a data instance in one or more clusters of the plurality of clusters based on features of the plurality of features other than the first feature with training data including the plurality of data instances using the associated cluster label as the output to be learned by the interpretation model;

determining, based on the trained interpretation model, a second feature of the plurality of features, different from the first feature, and a decision value of the second feature that predicts membership in a first cluster of the plurality of clusters relative to a second cluster of the plurality of clusters, such that the second feature and decision value most correlate with the cluster membership describing data instances having similar model outputs of the trained computer model for the range of values for the first feature; and

providing the clustered data instances for display to a user to view the effects of the first feature on the model outputs and an indication of the decision value of the second feature.
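
The steps recited in claim 1 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the toy `model` function stands in for the trained computer model, the synthetic three-feature data set is random, the clustering is a plain two-cluster k-means over the per-instance variation curves, and the interpretation model is a single-split decision stump. The claim does not name any particular clustering algorithm or interpretation model type.

```python
import random

def model(x):
    # Hypothetical stand-in for the trained computer model: the effect of
    # feature 0 flips sign depending on feature 1; feature 2 adds a small offset.
    slope = 1.0 if x[1] > 0.5 else -1.0
    return slope * x[0] + 0.1 * x[2]

def variation_curve(model, instance, feature, grid):
    # Instance-feature variation plot: model outputs as one feature is swept
    # over a range of values while the other features stay fixed.
    curve = []
    for v in grid:
        varied = list(instance)
        varied[feature] = v
        curve.append(model(varied))
    return curve

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def two_means(curves, iters=20):
    # Two-cluster k-means (an assumed choice of clustering method).
    # Deterministic init: the first curve and the curve farthest from it.
    centers = [curves[0], max(curves, key=lambda c: sq_dist(c, curves[0]))]
    labels = []
    for _ in range(iters):
        labels = [min((0, 1), key=lambda k: sq_dist(c, centers[k])) for c in curves]
        for k in (0, 1):
            members = [c for c, l in zip(curves, labels) if l == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def best_split(data, labels, excluded):
    # Decision stump as the interpretation model (an assumed choice): find the
    # feature, other than the varied one, and the threshold that misclassify
    # the fewest cluster labels.
    best = None  # (errors, feature, threshold)
    for f in range(len(data[0])):
        if f == excluded:
            continue
        values = sorted(set(row[f] for row in data))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for row, l in zip(data, labels) if row[f] <= t]
            right = [l for row, l in zip(data, labels) if row[f] > t]
            errors = (min(left.count(0), left.count(1))
                      + min(right.count(0), right.count(1)))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best

random.seed(0)
data = [[random.random() for _ in range(3)] for _ in range(200)]  # synthetic data set
grid = [i / 10 for i in range(11)]                                # range of values for the first feature
curves = [variation_curve(model, row, 0, grid) for row in data]   # one curve per data instance
labels = two_means(curves)                                        # cluster label per data instance
errors, second_feature, decision_value = best_split(data, labels, excluded=0)
print("second feature:", second_feature, "decision value:", decision_value)
```

In this toy setup the two clusters correspond to rising and falling curves, and the stump should recover feature 1 with a decision value near the 0.5 boundary built into the stand-in model. A display step would then plot the curves colored by `labels` and annotate the plot with `second_feature` and `decision_value`.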