US 12,346,821 B2
Methods and systems for enhanced sensor assessments for predicting secondary endpoints
John Wesley Gottula, Fuquay-Varina, NC (US)
Assigned to SAS INSTITUTE INC., Cary, NC (US)
Filed by SAS INSTITUTE INC., Cary, NC (US)
Filed on Apr. 17, 2024, as Appl. No. 18/637,794.
Claims priority of provisional application 63/541,549, filed on Sep. 29, 2023.
Claims priority of provisional application 63/459,998, filed on Apr. 17, 2023.
Prior Publication US 2024/0346382 A1, Oct. 17, 2024
Int. Cl. G06N 20/00 (2019.01); G06F 18/20 (2023.01); G06F 18/2135 (2023.01); G06N 3/09 (2023.01)
CPC G06N 3/09 (2023.01) [G06F 18/2135 (2023.01); G06F 18/295 (2023.01); G06N 20/00 (2019.01)]
30 Claims
 
1. A computer-program product comprising a non-transitory machine-readable storage medium storing computer instructions that, when executed by one or more processors, perform operations comprising:
obtaining a corpus of raw sensor data based on observations captured by one or more sensors;
transforming the corpus of raw sensor data to an aggregated sensor data set based on applying a data aggregation technique that was selected from a plurality of data aggregation techniques;
selecting a set of features from the aggregated sensor data set based on applying to the aggregated sensor data set a feature selection algorithm that was selected from a plurality of heterogeneous feature selection algorithms based on having a highest feature selection accuracy for a machine learning model; and
predicting, by the machine learning model that was selected from a plurality of heterogeneous machine learning models, a value of a secondary endpoint based on an input of the selected set of features to the machine learning model, wherein the machine learning model was specifically trained to use sensor data output from the one or more sensors using an iterative machine learning model optimization process comprising:
(i) executing by the one or more processors a multi-stage pipeline that adjusts hyperparameters of the machine learning model, feature selection criteria for the machine learning model, and model architectures of the machine learning model in response to dynamic evaluations of performance metrics of the machine learning model;
(ii) initializing a plurality of candidate heterogeneous machine learning models with varied hyperparameters and feature selection criteria specific to the aggregated sensor data;
(iii) executing a cross-validation procedure for the plurality of candidate heterogeneous machine learning models using a holdout validation data set derived from the aggregated sensor data set;
(iv) evaluating predictive accuracy of the plurality of candidate heterogeneous machine learning models by computing model performance metrics;
(v) iteratively adjusting hyperparameters for training the plurality of candidate heterogeneous machine learning models and re-selecting feature sets based on computed model performance metrics of the training; and
(vi) ranking the plurality of candidate heterogeneous machine learning models, once trained, based on the computed performance metrics and selecting the machine learning model from the plurality of candidate heterogeneous machine learning models based on having a highest predictive accuracy for the secondary endpoint,
wherein the iterative machine learning model optimization process causes the machine learning model to be specifically configured for the one or more sensors that outputs predictions with a quantified accuracy level derived from an evaluation of the computed performance metrics, and
wherein outputs of the one or more sensors including the corpus of raw sensor data are indirectly related to an ultimate value of the secondary endpoint; and
determining whether the value of the secondary endpoint predicted by the machine learning model satisfies a desired value of the secondary endpoint,
wherein if the value of the secondary endpoint predicted by the machine learning model does not satisfy the desired value of the secondary endpoint, outputting via a graphical user interface one or more proposed adjustments that, when executed, likely facilitate the ultimate value of the secondary endpoint towards the desired value of the secondary endpoint.
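
For illustration only, the flow recited in claim 1 (aggregate the raw sensor data, train several heterogeneous candidate models with varied hyperparameters, score them on held-out data, rank them, select the most accurate, then compare the prediction with a desired value) might be sketched as follows. This is a minimal sketch under stated assumptions and not the patented implementation: every name in it (aggregate, MeanModel, LinearModel, KnnModel, select_model, assess) is hypothetical, the data is synthetic, mean absolute error stands in for the unspecified performance metrics, a single holdout split stands in for the cross-validation procedure, "satisfies" is read as "meets or exceeds", and feature selection is omitted because only one aggregated feature is used.

```python
from statistics import mean


def aggregate(raw, window, technique):
    """Collapse raw readings into one value per fixed-size window."""
    fn = {"mean": mean, "max": max}[technique]
    return [fn(raw[i:i + window]) for i in range(0, len(raw) - window + 1, window)]


class MeanModel:
    """Baseline candidate: always predicts the training-set mean."""
    def fit(self, xs, ys):
        self.mu = mean(ys)
        return self

    def predict(self, x):
        return self.mu


class LinearModel:
    """One-feature ordinary least squares candidate."""
    def fit(self, xs, ys):
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.slope = sxy / sxx if sxx else 0.0
        self.intercept = my - self.slope * mx
        return self

    def predict(self, x):
        return self.intercept + self.slope * x


class KnnModel:
    """k-nearest-neighbour regressor; k is the varied hyperparameter."""
    def __init__(self, k):
        self.k = k

    def fit(self, xs, ys):
        self.data = list(zip(xs, ys))
        return self

    def predict(self, x):
        nearest = sorted(self.data, key=lambda p: abs(p[0] - x))[:self.k]
        return mean(y for _, y in nearest)


def mae(model, xs, ys):
    """Mean absolute error, used here as the performance metric (assumption)."""
    return mean(abs(model.predict(x) - y) for x, y in zip(xs, ys))


def select_model(xs, ys, holdout_fraction=0.25):
    """Train every candidate, score it on the holdout split, rank, pick the best."""
    split = int(len(xs) * (1 - holdout_fraction))
    train_x, train_y = xs[:split], ys[:split]
    hold_x, hold_y = xs[split:], ys[split:]
    candidates = {"mean": MeanModel(), "linear": LinearModel()}
    candidates.update({f"knn(k={k})": KnnModel(k) for k in (1, 3, 5)})
    scores = {name: mae(m.fit(train_x, train_y), hold_x, hold_y)
              for name, m in candidates.items()}
    ranking = sorted(scores.items(), key=lambda kv: kv[1])  # lowest error first
    best = ranking[0][0]
    return best, candidates[best], ranking


def assess(predicted, desired_minimum):
    """Return proposed adjustments when the prediction misses the desired value."""
    if predicted >= desired_minimum:
        return []
    return [f"predicted {predicted:.2f} is below target {desired_minimum:.2f}; "
            "review upstream conditions"]


# Synthetic stand-ins for the sensor corpus and the secondary endpoint.
raw = [float(i % 7 + i // 4) for i in range(80)]
features = aggregate(raw, window=4, technique="mean")
endpoint = [2.0 * x + 1.0 for x in features]

best_name, best_model, ranking = select_model(features, endpoint)
proposals = assess(best_model.predict(features[-1]), desired_minimum=100.0)
```

The claimed process also adjusts hyperparameters iteratively and re-selects feature sets between rounds; the sketch collapses that loop into a single pass over a fixed grid of k values to stay short.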