US 12,248,889 B2
Stochastic risk scoring with counterfactual analysis for storage capacity
Rahul Deo Vishwakarma, Kolkata (IN); Bing Liu, Tianjin (CN); and Parmeshwr Prasad, Bangalore (IN)
Assigned to EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed by EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed on Jan. 20, 2021, as Appl. No. 17/153,294.
Prior Publication US 2022/0230083 A1, Jul. 21, 2022
Int. Cl. G06N 7/01 (2023.01); G06F 3/06 (2006.01)
CPC G06N 7/01 (2023.01) [G06F 3/0653 (2013.01)]
20 Claims
 
1. A method, comprising:
accessing a dataset;
selecting a list of attributes of the dataset, each of the attributes being selected based on a determination that the attribute is affecting growth of the dataset and affecting an amount of data storage space consumed by the dataset;
assigning a SHAP score to each attribute;
using the SHAP scores to assign respective weights to each attribute;
deriving drift and shock information for the dataset, and the drift and shock information is derived from the SHAP scores;
based on the drift and shock information, calculating a risk score that a storage capacity of an asset where the dataset is stored will be exhausted within a particular time interval; and
using the risk score as a basis to identify, and implement, an action to reduce a risk that the storage capacity of the asset will be exhausted within the particular time interval.
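
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how weights, drift, shock, or the risk score are computed, so everything below is an assumption made for illustration: weights are normalized absolute SHAP scores, drift is the weighted mean of each attribute's per-interval growth, shock is the weighted standard deviation, and the risk score is the fraction of simulated random-walk paths that reach capacity within the interval. The function names, the threshold, and the action labels are hypothetical, and the SHAP scores are taken as given inputs rather than computed from a model.

```python
import random
import statistics


def shap_weights(shap_scores):
    """Turn per-attribute SHAP scores into weights that sum to 1.

    Assumption: the weight is the normalized absolute SHAP score.
    """
    total = sum(abs(s) for s in shap_scores.values())
    if total == 0:
        raise ValueError("all SHAP scores are zero")
    return {name: abs(s) / total for name, s in shap_scores.items()}


def drift_and_shock(growth_by_attribute, weights):
    """Return (drift, shock) for the dataset.

    Assumption: drift is the weighted mean of per-interval growth and
    shock is the weighted population standard deviation of that growth.
    """
    drift = sum(weights[a] * statistics.fmean(g)
                for a, g in growth_by_attribute.items())
    shock = sum(weights[a] * statistics.pstdev(g)
                for a, g in growth_by_attribute.items())
    return drift, shock


def risk_score(used, capacity, drift, shock, horizon, n_paths=2000, seed=0):
    """Estimate the probability that usage reaches capacity within `horizon` steps.

    Assumption: usage follows an additive random walk with Gaussian noise,
    and the score is the fraction of simulated paths that hit capacity.
    """
    rng = random.Random(seed)
    exhausted = 0
    for _ in range(n_paths):
        level = used
        for _ in range(horizon):
            level += drift + shock * rng.gauss(0.0, 1.0)
            if level >= capacity:
                exhausted += 1
                break
    return exhausted / n_paths


def recommend_action(score, threshold=0.5):
    """Hypothetical policy: act only when the risk score crosses a threshold."""
    return "expand_or_rebalance" if score >= threshold else "no_action"


if __name__ == "__main__":
    # Illustrative inputs only; none of these values come from the patent.
    scores = {"daily_ingest_gb": 0.6, "snapshot_gb": -0.3, "log_gb": 0.1}
    growth = {
        "daily_ingest_gb": [4.0, 6.0, 5.0, 7.0],
        "snapshot_gb": [1.0, 1.5, 0.5, 1.0],
        "log_gb": [0.2, 0.3, 0.2, 0.3],
    }
    weights = shap_weights(scores)
    drift, shock = drift_and_shock(growth, weights)
    score = risk_score(used=800.0, capacity=900.0, drift=drift,
                       shock=shock, horizon=30)
    print(weights, drift, shock, score, recommend_action(score))
```

A fixed seed keeps the simulated score reproducible between runs. With shock set to zero the walk is deterministic, so the score is exactly 0 or 1 depending on whether drift alone reaches capacity within the interval.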