| CPC G16B 40/20 (2019.02) [G06N 3/04 (2013.01); G06N 3/086 (2013.01); G06N 20/00 (2019.01); G16B 40/00 (2019.02); G06F 17/00 (2013.01); G16Z 99/00 (2019.02)] | 17 Claims |

|
1. A method, performed by a computing system having at least one processor and at least one memory, for discovering features for use in a trained machine learning model for diagnosing medical conditions, the method comprising:
for each of a plurality of feature generators,
for each of a plurality of sets of data signals,
extracting values from a particular set of data signals,
transforming the particular set of data signals by using the extracted values to generate a set of normalized values, and
applying a particular feature generator to the set of normalized values to produce a feature value, and
generating a set of feature vectors based on the produced feature values;
for each of a plurality of the generated feature vectors, calculating a novelty score;
identifying one or more feature generators from among the plurality of feature generators whose first calculated novelty score exceeds a novelty threshold;
generating a mutated one or more feature generators, comprising applying at least one of a point mutation, random recombination, sub-tree mutation, or a combination thereof to the one or more feature generators; and
using the mutated one or more feature generators, performing operations comprising:
generating additional training data comprising the mutated one or more feature generators;
processing the additional training data to discard the mutated one or more feature generators where a second novelty score of the mutated one or more feature generators is under the novelty threshold;
adding, to a machine learning pipeline, the processed additional training data; and
causing at least one trained machine learning model in the machine learning pipeline to be incrementally retrained using the processed additional training data.
|
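The discovery steps recited in claim 1 (normalizing each set of data signals, applying feature generators, scoring novelty, mutating the novel generators, and discarding mutants under the threshold) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation. The claim does not specify the normalization, the generator representation, or the novelty metric, so the sketch assumes z-score normalization, generators expressed as hypothetical (transform, reducer) pairs, and novelty measured as the mean Euclidean distance to the k nearest other feature vectors. All function names are hypothetical, and the incremental retraining step is omitted.

```python
import math
import random
from statistics import mean, pstdev

# Hypothetical building blocks; a generator is a (transform, reducer) pair.
TRANSFORMS = {"identity": lambda x: x, "abs": abs, "square": lambda x: x * x}
REDUCERS = {"mean": mean, "max": max, "range": lambda xs: max(xs) - min(xs)}


def normalize(signal):
    """Extract mean and standard deviation, then z-score the signal (assumed scheme)."""
    mu, sigma = mean(signal), pstdev(signal)
    return [(x - mu) / sigma if sigma else 0.0 for x in signal]


def feature_vector(generator, signals):
    """Apply one generator to every normalized signal: one feature value per signal."""
    transform, reducer = generator
    return [
        REDUCERS[reducer]([TRANSFORMS[transform](x) for x in normalize(s)])
        for s in signals
    ]


def novelty(vec, others, k=2):
    """Assumed metric: mean Euclidean distance to the k nearest other vectors."""
    if not others:
        return 0.0
    dists = sorted(math.dist(vec, o) for o in others)
    return mean(dists[:k])


def select_novel(generators, signals, threshold):
    """Keep generators whose feature vector scores above the novelty threshold."""
    vectors = {g: feature_vector(g, signals) for g in generators}
    return [
        g
        for g, v in vectors.items()
        if novelty(v, [w for h, w in vectors.items() if h != g]) > threshold
    ]


def point_mutation(generator, rng):
    """Replace exactly one component of the generator with a different one."""
    transform, reducer = generator
    if rng.random() < 0.5:
        transform = rng.choice([t for t in TRANSFORMS if t != transform])
    else:
        reducer = rng.choice([r for r in REDUCERS if r != reducer])
    return (transform, reducer)


def filter_mutants(mutants, existing, signals, threshold):
    """Discard mutants whose second novelty score does not exceed the threshold."""
    known = [feature_vector(g, signals) for g in existing]
    return [
        m for m in mutants
        if novelty(feature_vector(m, signals), known) > threshold
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    signals = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 8.0], [0.0, 5.0, 0.0, 5.0]]
    generators = [(t, r) for t in TRANSFORMS for r in REDUCERS]
    novel = select_novel(generators, signals, threshold=0.5)
    mutants = [point_mutation(g, rng) for g in novel]
    print(filter_mutants(mutants, generators, signals, threshold=0.5))
```

In this sketch the surviving mutants would stand in for the processed additional training data that the claim adds to the machine learning pipeline. The threshold of 0.5 is arbitrary and chosen only for the example.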