US 12,188,081 B2
Bioinformatics methods of in silico validation and selection of circRNAs
Shobana Sekar, Phoenix, AZ (US); Winnie Liang, Phoenix, AZ (US); and Jonathan Keats, Phoenix, AZ (US)
Assigned to The Translational Genomics Research Institute, Phoenix, AZ (US)
Filed by THE TRANSLATIONAL GENOMICS RESEARCH INSTITUTE, Phoenix, AZ (US)
Filed on Jan. 24, 2020, as Appl. No. 16/752,115.
Claims priority of provisional application 62/807,714, filed on Feb. 19, 2019.
Claims priority of provisional application 62/796,491, filed on Jan. 24, 2019.
Prior Publication US 2020/0239939 A1, Jul. 30, 2020
Int. Cl. C12Q 1/6809 (2018.01); C12N 9/22 (2006.01); C12N 15/113 (2010.01); C12Q 1/686 (2018.01); G16B 30/00 (2019.01); G16B 50/30 (2019.01)
CPC C12Q 1/6809 (2013.01) [C12N 9/22 (2013.01); C12N 15/113 (2013.01); C12Q 1/686 (2013.01); G16B 30/00 (2019.02); G16B 50/30 (2019.02); C12N 2310/141 (2013.01)]
18 Claims
 
1. A bioinformatics method of selecting a circRNA for further analysis comprising:
receiving with a processor:
a sequence file of a dataset comprising RNA sequencing reads,
a reference genome, and
a circRNA junction;
extracting start-reads and end-reads from the sequence file with the processor;
generating a plurality of contigs by assembling the start-reads and end-reads with the processor;
extracting a start-sequence and an end-sequence from the reference genome with the processor;
concatenating the start-sequence and the end-sequence into at least one pseudo-reference;
aligning the plurality of contigs with the at least one pseudo-reference; and
selecting the circRNA for performing functional studies for the circRNA if the plurality of contigs overlaps with the at least one pseudo-reference above a threshold stringency and the threshold stringency requires a minimum of a 10-bp overlap between the plurality of contigs and the at least one pseudo-reference on either side of the circRNA junction;
wherein the functional study comprises (i) use of the circRNA as a probe to identify a biomolecule that binds to the circRNA; (ii) querying of circRNA sequences against known gene sequences to identify genes or transcripts that are regulated by specific circRNAs; (iii) knock-down or knock-out of the circRNA studies; or (iv) determining the in vivo functional consequence of loss-of-function circRNA phenotype.
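
The pseudo-reference construction and the 10-bp junction test recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that the claim does not state: coordinates are 0-based, half-open and on the forward strand; the end-sequence is placed before the start-sequence so that the back-splice junction falls in the middle of the pseudo-reference; a toy exact-match search stands in for a real aligner; and a circRNA is selected as soon as any one contig passes. All function names and the `flank` parameter are hypothetical.

```python
from typing import List, Optional, Tuple

# Minimum overlap required on each side of the junction (10 bp, from claim 1).
MIN_FLANK_OVERLAP = 10


def build_pseudo_reference(genome_seq: str, start: int, end: int,
                           flank: int) -> Tuple[str, int]:
    """Return (pseudo_reference, junction_index) for a candidate circRNA.

    Assumptions (not from the claim): the circRNA spans genome_seq[start:end]
    with 0-based, half-open, forward-strand coordinates, and the back-splice
    joins the last `flank` bases of that span to its first `flank` bases.
    """
    if flank <= 0 or end - start < flank:
        raise ValueError("flank must be positive and fit inside the circRNA span")
    start_seq = genome_seq[start:start + flank]  # sequence at the circRNA start
    end_seq = genome_seq[end - flank:end]        # sequence at the circRNA end
    # Assumed order: end-sequence first, so the junction sits at len(end_seq).
    return end_seq + start_seq, len(end_seq)


def spans_junction(aln_start: int, aln_end: int, junction: int,
                   min_overlap: int = MIN_FLANK_OVERLAP) -> bool:
    """True if the aligned interval [aln_start, aln_end) on the pseudo-reference
    covers at least `min_overlap` bases on both sides of the junction."""
    left = junction - aln_start
    right = aln_end - junction
    return left >= min_overlap and right >= min_overlap


def find_exact(contig: str, pseudo_ref: str) -> Optional[Tuple[int, int]]:
    """Toy stand-in for an aligner: first exact substring match, or None."""
    pos = pseudo_ref.find(contig)
    if pos < 0:
        return None
    return pos, pos + len(contig)


def select_circrna(contigs: List[str], pseudo_ref: str, junction: int) -> bool:
    """Assumed decision rule: select if any contig spans the junction."""
    for contig in contigs:
        hit = find_exact(contig, pseudo_ref)
        if hit is not None and spans_junction(hit[0], hit[1], junction):
            return True
    return False
```

In practice the exact-match search would be replaced by a sequence aligner that tolerates mismatches and gaps, and `spans_junction` would be applied to the aligned coordinates that the aligner reports.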