Downloads
- RNAz predictions. The file contains the RNAz [2] predictions for 161,347 windows with p-score > 0.9. An example RNAz output is shown in the box below. Each alignment window contains the human sequence as reference, which is identified by the chromosome, the start position, the length of the sequence, and the strand (e.g., hg19.chr9 113582330 91 +). This information is unique for each window and therefore serves as the window identifier. Module predictions can be assigned to a window by means of this identifier.
For further explanations on the RNAz output see the manual in the RNAz documentation.
############################ RNAz 2.1 ##############################
Sequences: 6
Columns: 102
Reading direction: forward
Mean pairwise identity: 82.85
Shannon entropy: 0.30473
G+C content: 0.30707
Mean single sequence MFE: -17.65
Consensus MFE: -12.12
Energy contribution: -12.10
Covariance contribution: -0.02
Combinations/Pair: 1.35
Mean z-score: -2.19
Structure conservation index: 0.69
Background model: dinucleotide
Decision model: sequence based alignment quality
SVM decision value: 1.29
SVM RNA-class probability: 0.922574
Prediction: RNA
######################################################################
>hg19.chr9 113582330 91 + 141213431
GUGUACAAAAAAAA----------UAAAAUAGGUCAAUUGA-AAAUAGCUGAAAUAUUGAGGGAUAUUUAGAGUACAAGAGUACUCACAGCAGUAUUGCAAU
((((((........----------.........((....))-.....(((((((((((....))))))).((((((....)))))).)))).))).)))... ( -17.80, z-score = -2.86, R)
>panTro4.chr9 109750406 91 + 137840987
GUGUACAAAAAAAA----------UAAAAUAGUUCAAUUGA-AAAUAGUUGAAAUAUUGAGGGAUAUUUAGAGUACAAGAGUACUCACAGCAGUAUUGCAAU
((((((........----------........(((....))-)....(((((((((((....))))))).((((((....)))))).)))).))).)))... ( -16.30, z-score = -2.50, R)
>ponAbe2.chr9 107210115 90 + 135191526
GUGUAC-AAAAAAA----------UAAAAUAGUUCAAUUAA-AAAUAGCUGAAAUGUUGACGGAUAUUUAGGGUACAAGAGUACUCACAGCAGUAUUGCAAU
......-.......----------.................-.....(((((((((((....))))))).((((((....)))))).))))........... ( -15.50, z-score = -1.56, R)
>rheMac2.chr15 84754536 90 - 110119387
GUGUAC-ACAAAAU----------AAAUAUAGUUCAGUUGA-AAAUAGCUGAAAUAUUGAGGGAUAUUUAGGGUACAAGAGUACUCACAGCAGUAUUGCAAU
((((((-.......----------........(((....))-)....(((((((((((....))))))).((((((....)))))).)))).))).)))... ( -17.80, z-score = -1.93, R)
>calJac3.chr1 154282058 87 + 210400635
GUGUGC-AAAAAAU----------UAAAAU-GUUCAGUUGA-AAAUAGCUGAAAUAUUGAGGGAUAUUUAGUGCACAAGAGUA--CGCAGCAGUAUUGCGAU
(((..(-.(((.((----------(..(((-(((((((((.-...)))))).))))))....))).))).)..))).......--(((((.....))))).. ( -21.10, z-score = -2.37, R)
>equCab2.chr25 15886210 102 + 39536964
GUAGACCAAAAAAAAAAGUUCAAUUAAAAUAGUUUAAUUAAAAAAUAGUUCCAAUGUUGGGAGAUAUUUAGAGCAUAAGAGUGUCCACAGCAGUGCUGCAAU
((((..(..............(((((((....)))))))...............((((((((.((.((((.....)))).)).))).))))))..))))... ( -17.40, z-score = -1.92, R)
>consensus
GUGUAC_AAAAAAA__________UAAAAUAGUUCAAUUGA_AAAUAGCUGAAAUAUUGAGGGAUAUUUAGAGUACAAGAGUACUCACAGCAGUAUUGCAAU
...............................................(((((((((((....))))))).((((((....)))))).))))........... (-12.12 = -12.10 + -0.02)
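The window identifier is written with spaces in the RNAz output (hg19.chr9 113582330 91 +) and with = signs in the module prediction files (##hg19.chr9=113582330=91=+). The following Python sketch shows one way to bring both forms to a common key so that module predictions can be matched to RNAz windows. The names Window, window_from_rnaz_header and window_from_module_header are hypothetical helpers introduced here for illustration; they are not part of RNAz or RMDetect.

```python
import re
from typing import NamedTuple, Optional


class Window(NamedTuple):
    """Identifier of an alignment window, taken from the human (hg19) sequence."""
    name: str      # assembly and chromosome, e.g. "hg19.chr9"
    start: int     # start position
    length: int    # sequence length
    strand: str    # "+" or "-"

    def key(self) -> str:
        # Same layout as the "##" header lines of the module prediction files.
        return f"{self.name}={self.start}={self.length}={self.strand}"


# Matches an RNAz sequence header of the human reference,
# e.g. ">hg19.chr9 113582330 91 + 141213431".
_RNAZ_HEADER = re.compile(r"^>(hg19\.\S+)\s+(\d+)\s+(\d+)\s+([+-])")


def window_from_rnaz_header(line: str) -> Optional[Window]:
    """Return the window for an hg19 header line, or None for any other line."""
    m = _RNAZ_HEADER.match(line)
    if m is None:
        return None
    name, start, length, strand = m.groups()
    return Window(name, int(start), int(length), strand)


def window_from_module_header(line: str) -> Window:
    """Parse a header such as "##hg19.chr9=113582330=91=+"."""
    name, start, length, strand = line.lstrip("#").strip().split("=")
    return Window(name, int(start), int(length), strand)


if __name__ == "__main__":
    w = window_from_rnaz_header(">hg19.chr9 113582330 91 + 141213431")
    print(w.key())
    print(w == window_from_module_header("##hg19.chr9=113582330=91=+"))
```

Because Window is a tuple, it can be used directly as a dictionary key when joining the RNAz and module prediction files.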
- metaRNAmodules predictions. The archive contains the metaRNAmodules (mRm) [3] predictions. The three files show the raw output of RMDetect for mRm models applied to the original windows with RNAz p-score > 0.9. For the analysis described in the manuscript, we extracted module predictions that are conserved in at least 3 sequences (occur >= 50%) and have an RMDetect score greater than or equal to the 75% quantile of the RMDetect score distribution.
RMDetect (rmd) [4] output. This is a tab-separated file showing the RMDetect output for mRm module predictions. Each entry starts with a header (lines beginning with ##) whose first line gives the window identifier. Lines starting with RF0 are individual 3D module predictions. The columns contain: module ID, version of the model, sequence ID, start and end position of the sequence (separated by -), RMDetect score, RMDetect base pair probability, RMDetect mutual information, start positions of the 3D module strands (separated by ;) in the ungapped/gapped sequence, 3D module sequence, positions of the first/second module strand in the ungapped sequence, and positions of the first/second module strand in the gapped sequence.
For further information on the RMDetect output see the RMDetect manual.
##hg19.chr9=113582330=91=+
## meta data:
##seq_file = /scratch/mgnt03.11206600.0/alignment390.aln
##both_strands = NO
##sequences = 6
##tests_all = 2859905
##tests_pairs = 170761
RF00177_579_2J02_440_446_456_463 1.0 calJac3.chr1/154282058_87_+ 0-87 7.90016E+00 1.00000E+00 7.48771E-03 13/24;40/53 UAAAAUG;AUAUUGAG 13;;;;;;/40;;;;;;; 24;;;;;;31/53;;;;;;;
RF00001_5_1N8R_75_81_101_106 1.0 equCab2.chr25/15886210_102_+ 0-102 2.49749E+00 1.00000E+00 6.06523E-03 43/43;59/59 AAUAGUUC;GGAG.AU 43;;;;;;;/59;;;;-;63; 43;;;;;;;/59;;;;-;63;
RF00001_196_2QBE_69_77_95_103 1.0 hg19.chr9/113582330_91_+ 0-91 6.52020E+00 1.00000E+00 6.25753E-02 76/87;30/40 CAGCA..GUAU;AAAAUAGCUG 76;;;;;-;-;81;;;/30;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
RF00059_318_2GDI_55_59_72_76 1.0 hg19.chr9/113582330_91_+ 0-91 2.20256E+00 1.00000E+00 1.97999E-02 85/96;0/0 UG..C;GUGUA 85;;-;-;87/0;;;; 96;;-;-;98/0;;;;
RF00177_579_2J02_440_446_456_463 1.0 panTro4.chr9/109750406_91_+ 0-91 6.03083E+00 1.00000E+00 2.14712E-02 14/24;42/53 UAAAAUA;AUAUUGAG 14;;;;;;/42;;;;;;; 24;;;;;;/53;;;;;;;
RF00059_318_2GDI_55_59_72_76 1.0 panTro4.chr9/109750406_91_+ 0-91 2.25899E+00 1.00000E+00 7.13236E-03 85/96;0/0 UG..C;GUGUA 85;;-;-;87/0;;;; 96;;-;-;98/0;;;;
RF00001_196_2QBE_69_77_95_103 1.0 panTro4.chr9/109750406_91_+ 0-91 1.26033E+00 1.00000E+00 7.48771E-03 76/87;30/40 CAGCA..GUAU;AAAAUAGUUG 76;;;;;-;-;81;;;/30;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
RF00001_5_1N8R_75_81_101_106 1.0 panTro4.chr9/109750406_91_+ 0-91 4.25707E-01 1.00000E+00 2.73801E-02 77/88;51/62 AGCAGUAU;GAUAUUU 77;;;;;;;/51;;;;;; 88;;;;;;;/62;;;;;;
RF00177_874_2F4V_1225_1231_1262_1268 1.0 ponAbe2.chr9/107210115_90_+ 0-90 1.07236E+01 1.00000E+00 4.38081E-02 3/3;27/38 UACAAAA;UA.AAAA 3;;;;;;/27;;-;29;;; 3;;;7;;;/38;;-;40;42;;
RF00177_145_2QBF_1222_1228_1259_1265 1.0 ponAbe2.chr9/107210115_90_+ 0-90 1.07236E+01 1.00000E+00 4.38081E-02 3/3;27/38 UACAAAA;UA.AAAA 3;;;;;;/27;;-;29;;; 3;;;7;;;/38;;-;40;42;;
RF00177_145_2QBF_1222_1228_1259_1265 1.0 ponAbe2.chr9/107210115_90_+ 0-90 8.36930E+00 1.00000E+00 5.15157E-02 3/3;26/37 UACAAAA;UU.AAAA 3;;;;;;/26;;-;28;;; 3;;;7;;;/37;;-;39;;42;
RF00177_874_2F4V_1225_1231_1262_1268 1.0 ponAbe2.chr9/107210115_90_+ 0-90 7.57575E+00 1.00000E+00 4.38081E-02 3/3;26/37 UACAAAA;UUAAAAA 3;;;;;;/26;;;;;; 3;;;7;;;/37;;;;42;;
RF00001_196_2QBE_69_77_95_103 1.0 ponAbe2.chr9/107210115_90_+ 0-90 6.42275E+00 1.00000E+00 1.62809E-01 75/87;29/40 CAGCA..GUAU;AAAAUAGCUG 75;;;;;-;-;80;;;/29;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
RF00059_318_2GDI_55_59_72_76 1.0 ponAbe2.chr9/107210115_90_+ 0-90 2.05836E+00 1.00000E+00 5.15157E-02 84/96;0/0 UG..C;GUGUA 84;;-;-;86/0;;;; 96;;-;-;98/0;;;;
RF00177_1166_2UXD_1224_1231_1262_1269 1.0 ponAbe2.chr9/107210115_90_+ 0-90 1.40122E+00 1.00000E+00 4.17290E-02 2/2;27/38 GUACAAAA;UA.AAAAU 2;;;;;;;/27;;-;29;;;; 2;;;;7;;;/38;;-;40;42;;;
RF00001_196_2QBE_69_77_95_103 1.0 rheMac2.chr15/84754536_90_- 0-90 6.46913E+00 1.00000E+00 1.08702E-02 75/87;29/40 CAGCA..GUAU;AAAAUAGCUG 75;;;;;-;-;80;;;/29;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
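The column layout described above can be read with a few lines of Python. The sketch below is only an illustration: ModuleHit and parse_rmd are hypothetical names, the fields are assumed to be separated by tabs as stated in the description, and a window header is recognised by a heuristic (a ## line with exactly three = signs and no spaces) that is an assumption based on the example box, not a documented rule of RMDetect.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class ModuleHit:
    module_id: str
    model_version: str
    seq_id: str
    seq_range: Tuple[int, int]            # start and end position of the sequence
    score: float                          # RMDetect score
    bpp: float                            # RMDetect base pair probability
    mutual_information: float             # RMDetect mutual information
    strand_starts: List[Tuple[int, ...]]  # (ungapped, gapped) start per strand
    module_seq: str                       # strand sequences separated by ";"


def is_window_header(line: str) -> bool:
    # Assumed heuristic: "##name=start=length=strand" with no spaces.
    body = line[2:].strip()
    return line.startswith("##") and body.count("=") == 3 and " " not in body


def parse_rmd(lines: Iterable[str]) -> Dict[str, List[ModuleHit]]:
    """Group the RF0... prediction lines by the window header preceding them."""
    hits: Dict[str, List[ModuleHit]] = {}
    window: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##"):
            if is_window_header(line):
                window = line[2:].strip()
                hits.setdefault(window, [])
            continue  # other "##" lines are metadata
        if window is None or not line.startswith("RF0"):
            continue
        f = line.split("\t")
        start, end = f[3].split("-")
        hits[window].append(ModuleHit(
            module_id=f[0],
            model_version=f[1],
            seq_id=f[2],
            seq_range=(int(start), int(end)),
            score=float(f[4]),
            bpp=float(f[5]),
            mutual_information=float(f[6]),
            strand_starts=[tuple(int(p) for p in s.split("/"))
                           for s in f[7].split(";")],
            module_seq=f[8],
        ))
    return hits
```

The last two columns (per-position lists in the ungapped and gapped sequence) are left unparsed here; see the RMDetect manual for their exact meaning.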
RMCluster overview (rmccluster). The raw output of RMDetect is clustered according to the positions of the individual predictions. The overview file starts with a header showing the window ID, followed by a summary of each cluster (cluster ID, model ID, number of predictions, percentage of occurrence, mean RMDetect score, mean RMDetect base pair probability, mean RMDetect mutual information, 3D module columns in the window).
##hg19.chr9=113582330=91=+
Cluster: 1 -> model: RF00001_196_2QBE_69_77_95_103 count: 4 occur_(%): 66.67 score: 5.168 bpp: 0.061 MI: 0.000 H: 0.000 cols: 87 40
Cluster: 2 -> model: RF00059_318_2GDI_55_59_72_76 count: 3 occur_(%): 50.00 score: 2.173 bpp: 0.026 MI: 0.000 H: 0.000 cols: 96 0
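The conservation filter mentioned for the manuscript analysis (at least 3 sequences, occur >= 50%, score at or above a threshold) could be applied to the overview summaries roughly as follows. This is a sketch under assumptions: parse_overview and filter_clusters are hypothetical names, the regular expression is derived only from the example shown here, and min_score stands in for the 75% quantile of the RMDetect score distribution, which is not part of this file and must be supplied by the user. Whether the original analysis compared the mean cluster score or the individual scores against that quantile is not stated, so the mean score is used here purely for illustration.

```python
import re
from typing import Dict, List

# Pattern for one cluster summary, derived from the example overview.
_CLUSTER = re.compile(
    r"Cluster:\s*(?P<cluster>\d+)\s*->\s*model:\s*(?P<model>\S+)\s+"
    r"count:\s*(?P<count>\d+)\s+"
    r"occur_\(%\):\s*(?P<occur>[\d.]+)\s+"
    r"score:\s*(?P<score>[-+\d.eE]+)"
)


def parse_overview(text: str) -> List[Dict[str, object]]:
    """Extract cluster ID, model ID, count, occurrence and mean score."""
    clusters = []
    for m in _CLUSTER.finditer(text):
        clusters.append({
            "cluster": int(m.group("cluster")),
            "model": m.group("model"),
            "count": int(m.group("count")),
            "occur": float(m.group("occur")),
            "score": float(m.group("score")),
        })
    return clusters


def filter_clusters(clusters, min_score: float,
                    min_count: int = 3, min_occur: float = 50.0):
    """Keep clusters that are conserved and score at least min_score.

    min_score is a user-supplied stand-in for the 75% quantile of the
    RMDetect score distribution (an assumption of this sketch).
    """
    return [c for c in clusters
            if c["count"] >= min_count
            and c["occur"] >= min_occur
            and c["score"] >= min_score]
```

Since the pattern is applied with finditer, it works on the overview whether each cluster sits on its own line or several summaries share one line.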
RMCluster (rmc) output. This file lists the clusters summarized in the overview file together with their members. Each cluster is preceded by two ## lines giving the window ID and the cluster number. The columns contain: module ID, version of the model, sequence ID, start and end position of the sequence (separated by -), RMDetect score, RMDetect base pair probability, RMDetect mutual information, start positions of the 3D module strands (separated by ;) in the ungapped/gapped sequence, 3D module sequence, positions of the first/second module strand in the ungapped sequence, and positions of the first/second module strand in the gapped sequence.
##hg19.chr9=113582330=91=+
##Cluster001
RF00001_196_2QBE_69_77_95_103 1.0 hg19.chr9/113582330_91_+ 0-91 6.52020E+00 1.00000E+00 6.25753E-02 76/87;30/40 CAGCA..GUAU;AAAAUAGCUG 76;;;;;-;-;81;;;/30;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
RF00001_196_2QBE_69_77_95_103 1.0 panTro4.chr9/109750406_91_+ 0-91 1.26033E+00 1.00000E+00 7.48771E-03 76/87;30/40 CAGCA..GUAU;AAAAUAGUUG 76;;;;;-;-;81;;;/30;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
RF00001_196_2QBE_69_77_95_103 1.0 ponAbe2.chr9/107210115_90_+ 0-90 6.42275E+00 1.00000E+00 1.62809E-01 75/87;29/40 CAGCA..GUAU;AAAAUAGCUG 75;;;;;-;-;80;;;/29;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
RF00001_196_2QBE_69_77_95_103 1.0 rheMac2.chr15/84754536_90_- 0-90 6.46913E+00 1.00000E+00 1.08702E-02 75/87;29/40 CAGCA..GUAU;AAAAUAGCUG 75;;;;;-;-;80;;;/29;;;;;;;;; 87;;;;;-;-;92;;;/40;42;;;;;;;;
##hg19.chr9=113582330=91=+
##Cluster002
RF00059_318_2GDI_55_59_72_76 1.0 hg19.chr9/113582330_91_+ 0-91 2.20256E+00 1.00000E+00 1.97999E-02 85/96;0/0 UG..C;GUGUA 85;;-;-;87/0;;;; 96;;-;-;98/0;;;;
RF00059_318_2GDI_55_59_72_76 1.0 panTro4.chr9/109750406_91_+ 0-91 2.25899E+00 1.00000E+00 7.13236E-03 85/96;0/0 UG..C;GUGUA 85;;-;-;87/0;;;; 96;;-;-;98/0;;;;
RF00059_318_2GDI_55_59_72_76 1.0 ponAbe2.chr9/107210115_90_+ 0-90 2.05836E+00 1.00000E+00 5.15157E-02 84/96;0/0 UG..C;GUGUA 84;;-;-;86/0;;;; 96;;-;-;98/0;;;;
- JAR3D IL predictions (seq results)
- JAR3D IL predictions (loop results)
- JAR3D HL predictions (seq results)
- JAR3D HL predictions (loop results)
JAR3D [5] predictions for internal (IL) and hairpin (HL) loops. The files contain JAR3D predictions with passed cutoff >= 50 and mean interior edit distance <= 4 for windows with RNAz p-score > 0.9.
Sequence result files show the results of each JAR3D module for each sequence (six lines per module). Loop result files show a summary of the sequence results (one line per module). An example is shown in the box below. Both file types are comma-separated files.
For further information on the JAR3D output see [5].
Sequence result for module IL_00998.1 for internal loop 37_44++68_86 of window hg19.chr9=8996009=92=+
##hg19.chr9=8996009=92=+
##37_44++68_86
filename,sequenceId,motifId,passedCutoff,score,percentile,interiorEditDistance,fullEditDistance,rotation
hg19.chr9=8996009=92=+=141213431=IL-37_44++68_86,calJac3.chr1_110701931_92_+_210400635_IL-37_44++68_86,IL_00998.1,false,-1010.0,0.0000,15,17,0.0000
hg19.chr9=8996009=92=+=141213431=IL-37_44++68_86,rheMac2.chr15_41977632_92_-_110119387_IL-37_44++68_86,IL_00998.1,false,-1018.5,0.0000,16,17,0.0000
hg19.chr9=8996009=92=+=141213431=IL-37_44++68_86,panTro4.chr9_9097028_92_+_137840987_IL-37_44++68_86,IL_00998.1,false,-1013.0,0.0000,16,17,0.0000
hg19.chr9=8996009=92=+=141213431=IL-37_44++68_86,hg19.chr9_8996009_92_+_141213431_IL-37_44++68_86,IL_00998.1,false,-1013.0,0.0000,16,17,0.0000
hg19.chr9=8996009=92=+=141213431=IL-37_44++68_86,equCab2.chr23_29761663_92_+_55726280_IL-37_44++68_86,IL_00998.1,false,-1012.8,0.0000,18,21,0.0000
hg19.chr9=8996009=92=+=141213431=IL-37_44++68_86,ponAbe2.chr9_81427788_92_-_135191526_IL-37_44++68_86,IL_00998.1,false,-1013.0,0.0000,16,17,0.0000
Loop result
##hg19.chr9=8996009=92=+
##37_44++68_86
filename,motifId,%passedCutoff,meanScore,medianScore,meanPercentile,medianPercentile,meanInteriorEditDistance,medianInteriorEditDistance,meanFullEditDistance,medianFullEditDistance,rotation
hg19.chr9=8996009=92=+=141213431_IL-37_44++68_86,IL_00998.1,0.0000,-1013.4,-1013.0,0.0000,0.0000,16.167,16.000,17.667,17.000,0.0000
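Loop result files of this shape can be read with Python's csv module. The sketch below is an illustration only: read_loop_results and passes_filter are hypothetical names, the column list is copied from the example header, and it is an assumption that the "passed cutoff >= 50" criterion refers to the %passedCutoff column.

```python
import csv
from typing import Dict, Iterable, List

# Column names as given in the header line of the loop result example.
LOOP_COLUMNS = [
    "filename", "motifId", "%passedCutoff", "meanScore", "medianScore",
    "meanPercentile", "medianPercentile", "meanInteriorEditDistance",
    "medianInteriorEditDistance", "meanFullEditDistance",
    "medianFullEditDistance", "rotation",
]


def read_loop_results(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Parse loop result lines, skipping "##" headers and the column header."""
    data = (l for l in lines if l.strip() and not l.startswith("##"))
    rows = []
    for rec in csv.reader(data, skipinitialspace=True):
        if rec[0] == "filename":      # repeated column header line
            continue
        row: Dict[str, object] = dict(zip(LOOP_COLUMNS, rec))
        for name in LOOP_COLUMNS[2:]:  # all remaining columns are numeric
            row[name] = float(row[name])
        rows.append(row)
    return rows


def passes_filter(row, min_passed: float = 50.0,
                  max_mean_interior_edit: float = 4.0) -> bool:
    # Assumption: "passed cutoff >= 50" is tested on the %passedCutoff column.
    return (row["%passedCutoff"] >= min_passed
            and row["meanInteriorEditDistance"] <= max_mean_interior_edit)
```

Applied to the example row above, passes_filter returns False, because 0% of the sequences pass the cutoff and the mean interior edit distance is 16.167.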
References
- Theis C, Zirbel CL, Höner zu Siederdissen C, Anthon C, Hofacker IL, Nielsen H, Gorodkin J. RNA 3D modules in genome-wide predictions of RNA 2D structure. (in preparation)
- Gruber AR, Findeiß S, Washietl S, Hofacker IL, Stadler PF (2010). RNAz 2.0: improved noncoding RNA detection. Pac. Symp. Biocomput., pages 69-79.
- Theis C, Höner zu Siederdissen C, Hofacker IL, Gorodkin J (2013). Automated identification of RNA 3D modules with discriminative power in RNA structural alignments. Nucleic Acids Research 41(22):9999-10009.
- Cruz JA, Westhof E (2011). Sequence-based identification of 3D structural modules in RNA with RMDetect. Nature Methods 8(6):513-521.
- Zirbel CL, Roll J, Blake SA, Petrov AI, Pirrung M, Leontis NB (2015). Identifying novel sequence variants of RNA 3D motifs. Nucleic Acids Res. 49. doi: 10.1093/nar/gkv651