US 11,868,503 B2
Recommending post modifications to reduce sensitive data exposure
Gray Franklin Cannon, Miami, FL (US); Indervir Singh Banipal, Austin, TX (US); Shikhar Kwatra, San Jose, CA (US); and Raghuveer Prasad Nagar, Rajasthan (IN)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Nov. 24, 2020, as Appl. No. 17/103,353.
Prior Publication US 2022/0164472 A1, May 26, 2022
Int. Cl. G06F 21/62 (2013.01)

CPC G06F 21/6245 (2013.01)

20 Claims

1. A computer-implemented method comprising:

associating post data of a request with a category from a predefined list of categories, the post data including data representative of post content authored by a user, the category being selected based on the post content and having a strongest correlation with the post content as compared to correlations of other categories in the list of categories;

adjusting a risk value associated with the category, by changing the risk value to a new risk value, the new risk value being based on an activity performed by an intended audience of the post content;

analyzing, using a machine learning classifier, the post data for potentially sensitive content;

generating, responsive to the analyzing of the post data, a first sensitive data indicator associated with the post data and a first confidence value, wherein the first sensitive data indicator identifies the post data as potentially containing sensitive information corresponding to the category, and wherein the first confidence value represents a first degree of certainty with which the machine learning classifier determined that the post data contains sensitive information;

generating, using an explanation algorithm, explanatory data associated with the first sensitive data indicator of the post data, wherein the explanatory data identifies (i) a feature of the post data that contributed to the machine learning classifier identifying the post data as potentially containing sensitive information, and (ii) a property of the intended audience at a time of the request;

generating, using a remedy module, a modified version of the post data that changes the feature to have a modified feature value, wherein a remedy option selector analyzes the post data in parallel with other processing to expedite processing of the post data;

analyzing, using the machine learning classifier, the modified version of the post data for potentially sensitive content;

generating, responsive to the analyzing of the modified version of the post data, a second sensitive data indicator associated with the modified version of the post data and a second confidence value, wherein the second sensitive data indicator identifies the modified version of the post data as potentially containing sensitive information, and wherein the second confidence value represents a second degree of certainty with which the machine learning classifier determined that the modified version of the post data contains sensitive information,

wherein the second confidence value differs from the first confidence value so as to indicate that the post data is more likely to contain sensitive data than the modified version of the post data; and

issuing an alert to the user indicating that the post data potentially contains sensitive data and recommending a post modification that changes the feature to have the modified feature value.
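
The steps recited in claim 1 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the keyword lists in `CATEGORIES`, the per-token weights in `SENSITIVE_WEIGHTS` (a toy stand-in for the machine learning classifier), the leave-one-out routine used as the explanation algorithm, the `[redacted]` placeholder used by the remedy step, and the names `recommend`, `adjust_risk`, `audience_size`, and `reshare_rate`. The parallel operation of the remedy option selector is not modeled.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical predefined categories, each with illustrative keywords.
CATEGORIES = {
    "health": {"diagnosis", "surgery", "medication"},
    "finance": {"salary", "loan", "account"},
    "location": {"address", "home", "vacation"},
}
# Hypothetical token weights standing in for a trained classifier.
SENSITIVE_WEIGHTS = {"diagnosis": 0.6, "salary": 0.5, "address": 0.7, "account": 0.4}


def tokenize(text: str) -> list[str]:
    return [w.strip(".,!?").lower() for w in text.split()]


def associate_category(text: str) -> str:
    """Select the category with the strongest correlation (here: keyword overlap)."""
    tokens = set(tokenize(text))
    return max(CATEGORIES, key=lambda c: len(CATEGORIES[c] & tokens))


def adjust_risk(base_risk: float, reshare_rate: float) -> float:
    """Change the category risk value based on an activity of the intended audience."""
    return min(1.0, base_risk * (1.0 + reshare_rate))


def classify(text: str) -> tuple[bool, float]:
    """Toy classifier: returns (sensitive data indicator, confidence value)."""
    confidence = min(1.0, sum(SENSITIVE_WEIGHTS.get(t, 0.0) for t in tokenize(text)))
    return confidence >= 0.5, confidence


def explain(text: str) -> int | None:
    """Leave-one-out explanation: index of the word whose removal lowers confidence most."""
    words = text.split()
    _, base = classify(text)
    best_idx, best_drop = None, 0.0
    for i in range(len(words)):
        _, conf = classify(" ".join(words[:i] + words[i + 1:]))
        if base - conf > best_drop:
            best_idx, best_drop = i, base - conf
    return best_idx


@dataclass
class Recommendation:
    category: str
    risk: float
    feature: str            # feature that contributed to the classification
    audience_size: int      # a property of the intended audience at request time
    first_confidence: float
    second_confidence: float
    modified_text: str


def recommend(text: str, base_risk: float, audience_size: int,
              reshare_rate: float) -> Recommendation | None:
    category = associate_category(text)
    risk = adjust_risk(base_risk, reshare_rate)
    flagged, first = classify(text)
    if not flagged:
        return None
    idx = explain(text)
    if idx is None:
        return None
    words = text.split()
    # Remedy step: change the contributing feature to a modified value.
    modified = " ".join(words[:idx] + ["[redacted]"] + words[idx + 1:])
    _, second = classify(modified)
    # Recommend only if the modified post is less likely to contain sensitive data.
    if second >= first:
        return None
    return Recommendation(category, risk, words[idx], audience_size,
                          first, second, modified)


if __name__ == "__main__":
    rec = recommend("My new address is 12 Elm Street", 0.4, 250, 0.5)
    if rec is not None:
        print(f"Alert: post may expose {rec.category} data via '{rec.feature}'.")
        print(f"Suggested post: {rec.modified_text}")
```

In this sketch the recommendation is returned only when the second confidence value is lower than the first, which mirrors the claim's requirement that the original post data be more likely to contain sensitive data than the modified version.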