| CPC G06F 8/38 (2013.01) [G06F 3/0481 (2013.01); G06F 16/285 (2019.01); G06Q 30/0201 (2013.01); G06Q 30/0204 (2013.01); G06Q 10/06395 (2013.01)] | 18 Claims |

|
1. A computer-implemented method comprising:
obtaining user data describing a new population of users of a software application comprising a plurality of different graphical user interfaces (GUIs), wherein the user data includes new graphical user interface (GUI) usage data generated by the new population of users;
sorting the new population of users into a plurality of lookalike cohorts comprising a lookalike cohort, wherein the lookalike cohort is similar to an existing cohort of a plurality of existing cohorts of a long-term user population of the software application, by:
obtaining a plurality of features characterizing existing GUI usage data corresponding to the long-term user population,
training an unsupervised machine learning model with the plurality of features to generate an output vector corresponding to an input to the unsupervised machine learning model, wherein the output vector is a vector of probabilities of the input being associated with the plurality of existing cohorts,
obtaining an input instance corresponding to a new user from the new population of users, wherein the input instance is a set of feature values corresponding to the new user based on the new GUI usage data,
inputting the input instance to the trained unsupervised machine learning model to obtain a new output vector corresponding to the new user, and
adding the new user to the lookalike cohort of the plurality of lookalike cohorts, wherein the lookalike cohort is assigned a long-term value similar to the existing cohort of the plurality of existing cohorts that the new user is most likely associated with based on a highest probability value of the new output vector;
generating a distribution of the plurality of lookalike cohorts;
extracting, using a random sampling algorithm, a plurality of samples from the distribution;
generating, from the plurality of samples, a normal distribution of predicted long term values of the new population of users;
selecting an expected long term value from the normal distribution;
generating, from the normal distribution, an estimated distribution, around the expected long term value, of estimated long-term values for the new population of users;
collecting, after generating the estimated distribution, at least retention data on the new population of users;
adjusting, before modifying the software application, the estimated distribution by adding the retention data to the new population of users to generate a second distribution;
generating, using a second random sampling algorithm, a second normal distribution from the second distribution;
selecting a second expected value from the second normal distribution;
generating, from the second normal distribution, a second estimated distribution of estimated long-term values for the new population of users,
wherein selecting further comprises selecting using the second estimated distribution of estimated long-term values for the new population of users;
selecting, using the expected long term value and the estimated distribution, a selected GUI from among the plurality of different GUIs; and
modifying the software application by presenting the selected GUI.
|
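The cohort assignment, sampling, and selection steps recited in claim 1 can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and does not come from the claim: the cohort long-term values in `COHORT_LTV`, the example output vectors, the GUI names `gui_a` and `gui_b`, every function name, and the rule of selecting the GUI with the highest expected value. The claim names neither a particular unsupervised model nor a particular sampling algorithm, so the sketch starts from already-computed output vectors and uses weighted resampling with replacement as one possible random sampling algorithm. The retention-data adjustment that produces the second distribution is omitted.

```python
import random
import statistics
from collections import Counter

# Hypothetical long-term value assigned to each existing cohort (assumption).
COHORT_LTV = {0: 5.0, 1: 20.0, 2: 60.0}

# Hypothetical model output vectors, one per new user, grouped by the GUI shown.
OUTPUT_VECTORS = {
    "gui_a": [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    "gui_b": [[0.1, 0.3, 0.6], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7], [0.5, 0.4, 0.1]],
}

def assign_lookalike_cohort(output_vector):
    """Return the index of the existing cohort with the highest probability."""
    return max(range(len(output_vector)), key=lambda i: output_vector[i])

def cohort_distribution(output_vectors):
    """Count how many new users fall into each lookalike cohort."""
    return Counter(assign_lookalike_cohort(v) for v in output_vectors)

def sampled_mean_ltvs(distribution, cohort_ltv, n_samples=2000, seed=0):
    """Draw repeated samples from the cohort distribution; return each sample's mean LTV."""
    rng = random.Random(seed)
    cohorts = list(distribution)
    weights = [distribution[c] for c in cohorts]
    size = sum(weights)  # each sample has as many draws as there are new users
    means = []
    for _ in range(n_samples):
        draw = rng.choices(cohorts, weights=weights, k=size)
        means.append(statistics.fmean([cohort_ltv[c] for c in draw]))
    return means

def normal_summary(means):
    """Summarize the sample means as a normal approximation: (expected value, std dev)."""
    return statistics.fmean(means), statistics.stdev(means)

def select_gui(summaries):
    """Assumed selection rule: pick the GUI with the highest expected long-term value."""
    return max(summaries, key=lambda name: summaries[name][0])

summaries = {
    name: normal_summary(sampled_mean_ltvs(cohort_distribution(vectors), COHORT_LTV))
    for name, vectors in OUTPUT_VECTORS.items()
}
selected = select_gui(summaries)
print(selected, summaries)
```

The standard deviation returned by `normal_summary` stands in for the estimated distribution around the expected value. A selection rule that also weighs that spread, for example by comparing lower bounds rather than means, would fit the same structure.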