| CPC G16B 40/20 (2019.02) | 20 Claims |

|
1. A computer-implemented method comprising:
training a target machine learning model by:
generating, utilizing the target machine learning model, a training predicted bioactivity result from a compound-protein machine learning representation comprising a plurality of training match scores;
comparing the training predicted bioactivity result from the compound-protein machine learning representation with a ground truth bioactivity result to determine a measure of loss; and
modifying parameters of the target machine learning model based on the measure of loss;
generating, utilizing a trained compound-protein interaction machine learning model, a plurality of match scores for a plurality of compound-protein pairs corresponding to a query compound and a plurality of proteins by generating a match score of the plurality of match scores by stripping one or more layers from the trained compound-protein interaction machine learning model, wherein the plurality of match scores comprise binding probabilities between the query compound and the plurality of proteins;
generating, utilizing the target machine learning model, a predicted bioactivity result for the query compound from the plurality of match scores for the plurality of compound-protein pairs; and
determining, utilizing a machine learning explainability model, one or more proteins from the plurality of compound-protein pairs contributing to the predicted bioactivity result for the query compound by modifying features comprising the plurality of match scores for the plurality of compound-protein pairs input to the target machine learning model to determine modified bioactivity results and marginal contributions of the plurality of proteins in generating the modified bioactivity results utilizing the target machine learning model.
|
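The training and explainability steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and it rests on several assumptions that do not come from the claim: the match scores are synthetic random numbers standing in for the output of a compound-protein interaction model (the layer-stripping step is not shown), the target model is a plain logistic regression, the measure of loss is cross-entropy, and the marginal contributions are computed as exact Shapley values against a hypothetical baseline (the mean training score per protein). All names (`predict`, `marginal_contributions`, `true_w`) are illustrative.

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic match scores (binding probabilities in [0, 1]).
# Rows are training compounds, columns are proteins.
n_compounds, n_proteins = 200, 4
train_scores = rng.random((n_compounds, n_proteins))
# ASSUMPTION: hypothetical ground-truth bioactivity labels.
true_w = np.array([3.0, 0.0, -2.0, 0.5])
labels = (train_scores @ true_w - 0.5 > 0).astype(float)

def predict(weights, bias, scores):
    """Target model (assumed logistic regression) over match scores."""
    return 1.0 / (1.0 + np.exp(-(scores @ weights + bias)))

# Training: predict, compare with ground truth to get a loss, update parameters.
weights, bias = np.zeros(n_proteins), 0.0
losses = []
eps = 1e-12
for _ in range(500):
    p = predict(weights, bias, train_scores)
    loss = -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))
    losses.append(loss)
    grad = p - labels  # gradient of cross-entropy w.r.t. the logit
    weights -= 0.5 * (train_scores.T @ grad) / n_compounds
    bias -= 0.5 * grad.mean()

def marginal_contributions(model, query_scores, baseline):
    """Exact Shapley values: a protein's score is 'removed' by replacing it
    with the baseline value, and the model is re-evaluated on each subset."""
    n = len(query_scores)
    contrib = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                idx = np.array(subset, dtype=int)
                x_without = baseline.copy()
                x_without[idx] = query_scores[idx]
                x_with = x_without.copy()
                x_with[i] = query_scores[i]
                contrib[i] += w * (model(x_with) - model(x_without))
    return contrib

# Query compound: one match score per protein (synthetic here).
query = rng.random(n_proteins)
baseline = train_scores.mean(axis=0)
model = lambda x: predict(weights, bias, x)

contrib = marginal_contributions(model, query, baseline)
ranking = np.argsort(-np.abs(contrib))  # proteins ordered by contribution size
print("predicted bioactivity:", model(query))
print("per-protein contributions:", contrib)
print("proteins ranked by |contribution|:", ranking)
```

Exact enumeration evaluates the model on every subset of proteins, so it is practical only for a handful of proteins; with larger protein panels a sampling-based approximation would be the usual substitute. The contributions sum to the difference between the prediction for the query and the prediction for the baseline, which gives a quick sanity check on the attribution.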