US 12,406,749 B2
Systems and methods for predicting repair outcomes in genetic engineering
Max Walt Shen, Cambridge, MA (US); Jonathan Yee-Ting Hsu, Cambridge, MA (US); Mandana Arbab, Cambridge, MA (US); David K. Gifford, Cambridge, MA (US); David R. Liu, Cambridge, MA (US); and Richard Irving Sherwood, Cambridge, MA (US)
Assigned to The Broad Institute, Inc., Cambridge, MA (US); Massachusetts Institute of Technology, Cambridge, MA (US); The Brigham and Women's Hospital, Inc., Boston, MA (US); and President and Fellows of Harvard College, Cambridge, MA (US)
Appl. No. 16/772,747
Filed by The Broad Institute, Inc., Cambridge, MA (US); Massachusetts Institute of Technology, Cambridge, MA (US); The Brigham and Women's Hospital, Inc., Boston, MA (US); and President and Fellows of Harvard College, Cambridge, MA (US)
PCT Filed Dec. 15, 2018, PCT No. PCT/US2018/065886 § 371(c)(1), (2) Date Jun. 12, 2020, PCT Pub. No. WO2019/118949, PCT Pub. Date Jun. 20, 2019.
Claims priority of provisional application 62/669,771, filed on May 10, 2018.
Claims priority of provisional application 62/599,623, filed on Dec. 15, 2017.
Prior Publication US 2022/0238182 A1, Jul. 28, 2022
Int. Cl. G16B 20/30 (2019.01); A61K 31/7088 (2006.01); A61K 38/46 (2006.01); C12N 9/22 (2006.01); C12N 15/10 (2006.01); C12N 15/11 (2006.01); G16B 40/20 (2019.01)

CPC G16B 20/30 (2019.02) [A61K 31/7088 (2013.01); A61K 38/465 (2013.01); C12N 9/22 (2013.01); C12N 15/1089 (2013.01); C12N 15/11 (2013.01); G16B 40/20 (2019.02); C12N 2310/20 (2017.05); C12N 2800/80 (2013.01)]

13 Claims

1. A method of introducing a genetic change into a target genomic location encoding a pathogenic allele to modify the pathogenic allele to become a non-pathogenic allele using a Cas-based double strand break genome editing system, the method comprising:

using a computer hardware processor to perform:

selecting a guide RNA for use in introducing the genetic change into the target genomic location by analyzing inputs indicating a nucleotide sequence of the target genomic location and one or more available cut sites for the Cas-based double strand break genome editing system, the selecting comprising:

(a) determining a microhomology score matrix using a first neural network, the determining comprising:

determining a plurality of pairs of overhang sequences using the inputs;

determining a microhomology length vector and/or a microhomology GC fraction vector using the inputs; and

applying the first neural network to the plurality of pairs of overhang sequences and the microhomology length vector and/or the microhomology GC fraction vector to obtain the microhomology score matrix;

(b) determining a microhomology-independent score matrix using a second neural network, the determining comprising:

determining a deletion length vector using the inputs and the plurality of pairs of overhang sequences; and

applying the second neural network to the deletion length vector to obtain the microhomology-independent score matrix;

(d) determining, using the microhomology score matrix, the microhomology-independent score matrix and the probability distribution over 1-bp insertion, a probability distribution over indel genotypes and a probability distribution over indel lengths for the nucleotide sequence of the target genomic location and the one or more available cut sites;

(e) determining, using the probability distribution over indel genotypes and the probability distribution over indel lengths, for each guide RNA of a plurality of guide RNAs, a predicted frequency of introducing the genetic change into the target genomic location using the Cas-based double strand break genome editing system and the guide RNA;

(f) selecting, using the predicted frequencies of (e), a guide RNA of the plurality of guide RNAs for use in introducing the genetic change into the target genomic location using the Cas-based double strand break genome editing system; and

introducing the genetic change into the target genomic location using the guide RNA selected at (f) and the Cas-based double strand break genome editing system, wherein the genetic change is a 1 base pair insertion or a 1-60 base pair deletion that modifies the pathogenic allele to become the non-pathogenic allele.
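
The computational steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything specific in it is an assumption made for illustration: the helper names (`tiny_mlp`, `indel_distributions`, `select_guide`), the network sizes and weights, the use of `exp` to turn network outputs into positive scores, and the flat score lists used in place of score matrices. The 1-bp insertion probabilities are treated as a given input. For simplicity, the desired genetic change is represented only by its signed indel length.

```python
import math

def tiny_mlp(x, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a linear output; returns a scalar."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

# Hypothetical, untrained weights chosen only so the example runs.
MH_NET = dict(w1=[[0.5, 1.0], [-0.2, 0.3]], b1=[0.0, 0.1], w2=[1.0, 0.5], b2=0.0)
MHLESS_NET = dict(w1=[[-0.1], [0.05]], b1=[1.0, 0.0], w2=[1.0, -1.0], b2=0.0)

def indel_distributions(mh_candidates, max_del_len, ins_1bp_probs):
    """Combine both score sets and the 1-bp insertion probabilities.

    mh_candidates: list of (deletion_length, mh_length, gc_fraction) tuples.
    max_del_len:   longest microhomology-independent deletion considered.
    ins_1bp_probs: dict mapping inserted base -> probability (assumed given).
    Returns (genotype distribution, signed indel length distribution).
    """
    scores = {}
    # (a) first network: microhomology length and GC fraction -> score
    for i, (del_len, mh_len, gc) in enumerate(mh_candidates):
        scores[("del_mh", del_len, i)] = math.exp(tiny_mlp([mh_len, gc], **MH_NET))
    # (b) second network: deletion length -> microhomology-independent score
    for del_len in range(1, max_del_len + 1):
        scores[("del_mhless", del_len)] = math.exp(tiny_mlp([del_len], **MHLESS_NET))
    # (d) deletions share the probability mass not assigned to 1-bp insertions
    ins_total = sum(ins_1bp_probs.values())
    del_total = sum(scores.values())
    genotypes = {k: (1.0 - ins_total) * s / del_total for k, s in scores.items()}
    for base, p in ins_1bp_probs.items():
        genotypes[("ins", 1, base)] = p
    lengths = {}
    for key, p in genotypes.items():
        signed = key[1] if key[0] == "ins" else -key[1]
        lengths[signed] = lengths.get(signed, 0.0) + p
    return genotypes, lengths

def select_guide(guides, target_length_change):
    """(e)-(f): predicted frequency of the desired change per guide, then argmax."""
    freq = {}
    for name, args in guides.items():
        _, lengths = indel_distributions(*args)
        freq[name] = lengths.get(target_length_change, 0.0)
    return max(freq, key=freq.get), freq

# Two hypothetical guides; the desired change is a 1-bp insertion (+1).
guides = {
    "gRNA_A": ([(3, 2, 0.5), (5, 3, 1.0)], 6, {"A": 0.10, "T": 0.05}),
    "gRNA_B": ([(3, 2, 0.5)], 6, {"A": 0.30, "T": 0.10}),
}
best, freq = select_guide(guides, +1)
print(best, freq)
```

In this sketch the deletion scores are normalized jointly and scaled by the mass left over after the insertion probabilities, so both returned distributions sum to one. A trained model would replace the placeholder weights, and a genotype-level target could be scored by reading the genotype distribution instead of the length distribution.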