CPC C12Q 1/6809 (2013.01) [C12N 9/22 (2013.01); C12N 15/113 (2013.01); C12Q 1/686 (2013.01); G16B 30/00 (2019.02); G16B 50/30 (2019.02); C12N 2310/141 (2013.01)] | 18 Claims |
1. A bioinformatics method of selecting a circRNA for further analysis comprising:
receiving with a processor:
a sequence file of a dataset comprising RNA sequencing reads,
a reference genome, and
a circRNA junction;
extracting start-reads and end-reads from the sequence file with the processor;
generating a plurality of contigs by assembling the start-reads and end-reads with the processor;
extracting a start-sequence and an end-sequence from the reference genome with the processor;
concatenating the start-sequence and the end-sequence into at least one pseudo-reference;
aligning the plurality of contigs with the at least one pseudo-reference; and
selecting the circRNA for performing functional studies for the circRNA if the plurality of contigs overlaps with the at least one pseudo-reference above a threshold stringency and the threshold stringency requires a minimum of a 10-bp overlap between the plurality of contigs and the at least one pseudo-reference on either side of the circRNA junction;
wherein the functional study comprises (i) use of the circRNA as a probe to identify a biomolecule that binds to the circRNA; (ii) querying of circRNA sequences against known gene sequences to identify genes or transcripts that are regulated by specific circRNAs; (iii) knock-down or knock-out of the circRNA studies; or (iv) determining the in vivo functional consequence of loss-of-function circRNA phenotype.
|
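The pseudo-reference and threshold steps of claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names, the 0-based end-exclusive coordinates, the `flank` length, the ungapped sliding comparison used in place of a real aligner, and the rule that one passing contig is enough to select the circRNA are all assumptions made for illustration. The claim also does not state the concatenation order; the sketch places the end-sequence before the start-sequence, as a back-splice would.

```python
from typing import Iterable, Tuple

MIN_FLANK_BP = 10  # minimum overlap on either side of the junction (from claim 1)


def build_pseudo_reference(genome: str, start: int, end: int, flank: int) -> Tuple[str, int]:
    """Return (pseudo_reference, junction_index) for a circRNA at genome[start:end].

    Assumption: the end-sequence is placed before the start-sequence, so the
    junction sits at index len(end_sequence) of the pseudo-reference.
    """
    start_seq = genome[start:min(end, start + flank)]  # first bases of the circRNA
    end_seq = genome[max(start, end - flank):end]      # last bases of the circRNA
    return end_seq + start_seq, len(end_seq)


def spans_junction(contig: str, pseudo_ref: str, junction: int,
                   min_flank: int = MIN_FLANK_BP, max_mismatches: int = 0) -> bool:
    """Ungapped stand-in for an aligner: slide the contig along the pseudo-reference
    and accept a placement whose overlap covers at least min_flank bases on each
    side of the junction with at most max_mismatches mismatches."""
    for offset in range(-len(contig) + 1, len(pseudo_ref)):
        lo = max(0, offset)                             # overlap start in pseudo_ref
        hi = min(len(pseudo_ref), offset + len(contig))  # overlap end (exclusive)
        if junction - lo < min_flank or hi - junction < min_flank:
            continue  # overlap does not reach far enough on one side
        mismatches = sum(pseudo_ref[i] != contig[i - offset] for i in range(lo, hi))
        if mismatches <= max_mismatches:
            return True
    return False


def select_circrna(contigs: Iterable[str], pseudo_ref: str, junction: int) -> bool:
    """Assumed selection rule: keep the circRNA if any contig spans the junction."""
    return any(spans_junction(c, pseudo_ref, junction) for c in contigs)
```

With a flank of 15 bp, a contig made of the last 12 and first 12 bases of the circRNA passes the check, while a contig reaching only 5 bases past the junction does not. In practice the contigs would come from an assembler and the comparison from a gapped aligner.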