US 12,260,938 B2
Machine learning driven gene discovery and gene editing in plants
Bradley Zamft, Mountain View, CA (US); Vikash Singh, Los Angeles, CA (US); Mathias Voges, Cupertino, CA (US); and Thong Nguyen, Pasadena, CA (US)
Assigned to HERITABLE AGRICULTURE INC., Mountain View, CA (US)
Filed by HERITABLE AGRICULTURE INC., Mountain View, CA (US)
Filed on Mar. 19, 2021, as Appl. No. 17/207,169.
Prior Publication US 2022/0301658 A1, Sep. 22, 2022
Int. Cl. G16B 40/00 (2019.01); G16B 5/20 (2019.01)
CPC G16B 40/00 (2019.02) [G16B 5/20 (2019.02)]
20 Claims
 
1. A method comprising:
obtaining a set of gene expression profiles for a set of genes measured in a tissue sample of a plant;
inputting the set of gene expression profiles into a prediction model having a deep learning architecture constructed for a task of predicting a phenotype as output data by learning relationships or correlations between features of the set of gene expression profiles and the phenotype;
generating, using the prediction model, the prediction of the phenotype for the plant based on the relationships or the correlations between the features of the set of gene expression profiles and the phenotype;
analyzing, by an explainable artificial intelligence system, decisions made by the prediction model to predict the phenotype, wherein the analyzing comprises: (i) generating a set of feature importance scores for the features used in the prediction of the phenotype, wherein the feature importance scores represent estimates of each feature's contribution or influence on the prediction of the phenotype, and (ii) ranking or otherwise sorting the features based on the feature importance score associated with each of the features, wherein highest ranking or sorted features are identified as having a largest contribution or influence on the prediction of the phenotype;
identifying, based on the ranked or otherwise sorted features, a set of candidate gene targets for the phenotype as having the largest contribution or influence on the prediction of the phenotype;
inputting the set of candidate gene targets into a gene edit modeling system having an architecture constructed to model gene edits and generate ideal gene expression profiles for the phenotype using one or more modeling approaches and the set of candidate gene targets, wherein the one or more modeling approaches generate the ideal gene expression profiles based on the feature importance scores, an optimization algorithm, the prediction model, or any combination thereof;
generating, using the gene edit modeling system, an ideal gene expression profile for the phenotype based on an optimal set of genetic targets for editing of each gene within the set of candidate gene targets, wherein the ideal gene expression profile is a recommendation of gene expression for the set of candidate gene targets for maximizing, minimizing, or otherwise modulating the phenotype; and
making, using a gene editing system, a genetic edit or perturbation to a genome of the plant based on the optimal set of genetic targets for editing each gene and the ideal gene expression profile.
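
The computational steps of claim 1 (predicting, scoring feature importance, ranking, selecting candidate gene targets, and generating an ideal expression profile) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy `predict` function stands in for a trained deep learning model, the permutation-style sensitivity score stands in for whatever explainable artificial intelligence method is actually used, and the coordinate-ascent grid search stands in for the claimed optimization algorithm. The names `predict`, `importance_scores`, `rank_candidates`, and `ideal_profile` are hypothetical. The final gene editing step is laboratory work and is outside the sketch.

```python
import numpy as np

def predict(X):
    """Hypothetical stand-in for the trained prediction model.
    Toy assumption: the phenotype depends only on genes 2 and 5."""
    X = np.atleast_2d(X)
    return np.tanh(2.0 * X[:, 2] - 1.5 * X[:, 5])

def importance_scores(model, X, rng):
    """Permutation-style sensitivity (an assumed XAI method): mean absolute
    change in the prediction when one gene's column is shuffled."""
    base = model(X)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        scores[j] = np.mean(np.abs(model(Xp) - base))
    return scores

def rank_candidates(scores, k):
    """Sort genes by descending importance and keep the top k as candidates."""
    return np.argsort(scores)[::-1][:k]

def ideal_profile(model, baseline, candidates, grid, sweeps=3):
    """Coordinate-ascent grid search (an assumed optimizer): vary only the
    candidate genes to maximize the predicted phenotype."""
    profile = baseline.copy()
    for _ in range(sweeps):
        for g in candidates:
            trials = np.tile(profile, (len(grid), 1))
            trials[:, g] = grid  # try every grid value for gene g
            profile[g] = grid[np.argmax(model(trials))]
    return profile

# Synthetic expression profiles: 200 samples x 8 genes, scaled to [0, 1].
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 8))

scores = importance_scores(predict, X, rng)
candidates = rank_candidates(scores, k=2)
baseline = X.mean(axis=0)
ideal = ideal_profile(predict, baseline, candidates, np.linspace(0.0, 1.0, 11))

print("candidate genes:", sorted(candidates.tolist()))
print("recommended expression for candidates:", ideal[candidates])
```

In this toy setting the two genes that drive the model are recovered as candidates, and the recommended profile pushes one to the top of its allowed range and the other to the bottom while leaving non-candidate genes at their baseline values. A real system would substitute measured expression data, a trained model, and expression bounds that are biologically achievable.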