US 12,176,068 B2
Methods for predicting genomic variation effects on gene transcription
Jian Zhou, Dallas, TX (US); Chandra Theesfeld, Princeton, NJ (US); and Olga G. Troyanskaya, Princeton, NJ (US)
Assigned to The Trustees of Princeton University, Princeton, NJ (US); and The Simons Foundation, Inc., New York, NY (US)
Appl. No. 17/041,836
Filed by The Trustees of Princeton University, Princeton, NJ (US); and The Simons Foundation, Inc., New York, NY (US)
PCT Filed Mar. 26, 2019, PCT No. PCT/US2019/024108
§ 371(c)(1), (2) Date Sep. 25, 2020.
PCT Pub. No. WO2019/191123, PCT Pub. Date Oct. 3, 2019.
Claims priority of provisional application 62/648,355, filed on Mar. 26, 2018.
Prior Publication US 2021/0027855 A1, Jan. 28, 2021
Int. Cl. G16B 20/20 (2019.01); G16B 20/50 (2019.01); G16B 25/10 (2019.01)
CPC G16B 20/20 (2019.02) [G16B 20/50 (2019.02); G16B 25/10 (2019.02)]
17 Claims
1. A method to perform site-directed mutagenesis on a biological cell, the method comprising:
obtaining genetic sequence data that includes a nucleotide sequence of a sequence structure of at least one gene;
determining, utilizing a computational framework, an expression level of the at least one gene,
wherein the computational framework includes a deep convolutional neural network and a linear regression model,
wherein the deep convolutional neural network of the computational framework utilizes the genetic sequence data to determine epigenetic regulatory features along a genetic sequence that includes the nucleotide sequence of the sequence structure of the at least one gene,
wherein the linear regression model of the computational framework determines the expression level of the at least one gene based on the epigenetic regulatory features along the genetic sequence that includes the nucleotide sequence of the sequence structure of the at least one gene sequence;
identifying, utilizing the computational framework, a set of one or more genetic variants within the genetic sequence data that includes the nucleotide sequence of the sequence structure of the at least one gene, wherein the set of one or more genetic variants alter the expression level of the at least one gene as determined by an effect of the set of one or more genetic variants on the epigenetic regulatory features along the genetic sequence that includes the nucleotide sequence of the sequence structure of at least one gene sequence; and
based on the identification of the set of variants that alter the expression level of the at least one gene, performing site-directed mutagenesis on DNA of a biological cell to introduce the identified genetic set of variants within a corresponding genetic sequence of the DNA that includes a corresponding nucleotide sequence of the sequence structure of the at least one gene such that the expression level of the at least one gene is altered in the biological cell.
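For illustration only, the computational steps recited in claim 1 (sequence to regulatory features, regulatory features to expression level, then ranking of variants by their predicted effect) can be sketched in a few lines of Python. This is a toy sketch under stated assumptions, not the claimed framework: the single random-weight convolution layer stands in for a trained deep convolutional neural network, the mean pooling step and the names `one_hot`, `regulatory_features`, `predict_expression`, and `score_variants` are hypothetical, and the weights are random rather than fitted to data.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases stay all-zero."""
    out = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def regulatory_features(x, kernels):
    """Stand-in for the convolutional network: one convolution plus ReLU.

    x: (L, 4) one-hot sequence; kernels: (n_features, width, 4).
    Returns an array of shape (L - width + 1, n_features).
    """
    width = kernels.shape[1]
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    scores = np.einsum("pwc,fwc->pf", windows, kernels)
    return np.maximum(scores, 0.0)

def predict_expression(seq, kernels, weights, bias):
    """Linear regression on position-averaged features (pooling is an assumption)."""
    feats = regulatory_features(one_hot(seq), kernels)
    pooled = feats.mean(axis=0)
    return float(pooled @ weights + bias)

def score_variants(seq, kernels, weights, bias):
    """Score every single-nucleotide substitution by its predicted expression change."""
    seq = seq.upper()
    ref_level = predict_expression(seq, kernels, weights, bias)
    results = []
    for pos, ref_base in enumerate(seq):
        for alt in BASES:
            if alt == ref_base:
                continue
            mutated = seq[:pos] + alt + seq[pos + 1:]
            delta = predict_expression(mutated, kernels, weights, bias) - ref_level
            results.append((pos, ref_base, alt, delta))
    # Largest absolute predicted effect first.
    return sorted(results, key=lambda r: abs(r[3]), reverse=True)

# Random, untrained parameters: placeholders for illustration only.
rng = np.random.default_rng(0)
kernels = rng.normal(size=(8, 6, 4))
weights = rng.normal(size=8)
seq = "ACGTTGCATGCCGATAGCTAGGCTA"
ranked = score_variants(seq, kernels, weights, 0.0)
```

In this sketch, the top entries of `ranked` would be the candidate substitutions to carry forward; the final wet-lab step of the claim (site-directed mutagenesis on the DNA of a cell) has no computational counterpart here.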