mirDBA: microRNA discovery by similarity search to a database of RNA-seq profiles

To identify putative miRNA candidates, please provide the mapped reads from a short RNA-seq experiment in BED format. Input fields marked with an * are required.

Input

Reference genome [?]
Enter BED file*
or Upload BED file  (If BAM, please convert to BED using this script)
Database [?]
E-mail

 Optional parameters

a) Block alignment
Match score[?]
Mismatch score[?]
Gap penalty (initialize)[?]
Gap penalty (extension)[?]
Threshold (expression difference)[?]
b) Block group alignment
Distance weight[?]
Block similarity weight[?]
Gap penalty[?]
c) Alignment
Threshold score[?]
Alignments per query[?]
       Use example dataset

 Frequently Asked Questions

Q. What can this web server be used for?
A. The web server can be used to compare the read profiles from a query RNA-seq experiment against our database of miRNA read profiles (miRRPdb). The database comprises 2,540 miRNA read profiles compiled from an analysis of short RNA-seq data from miRBase.
Q. Why is the input format not BAM?
A. We decided to use BED as the input format for two main reasons:
  • Converting BAM to BED significantly reduces the size of the file that needs to be uploaded. For example, this 1.3 GB RNA-seq dataset from ENCODE is reduced to 23 MB on conversion to BED format. Smaller files upload faster and put less load on the web server.
  • We provide a dependency-free Perl script, bam2bed.pl, which in addition to converting to BED format reports which reads are uniquely mapped and which are mapped to multiple locations. This information can be important for users who want to post-filter for read profiles composed only of uniquely mapped reads. Alternatively, the bamtobed utility from bedtools can be used to convert BAM into BED format.
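The bedtools route mentioned above can be scripted. The following is a minimal sketch, assuming bedtools is installed and on the PATH; the function names and the file names in the usage note are illustrative and not part of mirDBA. It performs a plain conversion only and does not reproduce the unique/multiple mapping information that bam2bed.pl reports.

```python
import shutil
import subprocess
from pathlib import Path


def bamtobed_command(bam_path):
    """Build the bedtools command line that prints a BAM file as BED records."""
    return ["bedtools", "bamtobed", "-i", str(bam_path)]


def convert_bam_to_bed(bam_path, bed_path):
    """Run bedtools bamtobed, write its output to bed_path, return the size in bytes."""
    if shutil.which("bedtools") is None:
        raise RuntimeError("bedtools was not found on PATH")
    with open(bed_path, "w") as out:
        # bedtools writes the BED records to standard output
        subprocess.run(bamtobed_command(bam_path), stdout=out, check=True)
    return Path(bed_path).stat().st_size
```

For example, `convert_bam_to_bed("reads.bam", "reads.bed")` (hypothetical file names) would write the BED file and return its size, which can then be compared with the upload limit.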
Q. Is there an upper limit to the size of the dataset that can be analyzed?
A. Yes. The web server currently supports a maximum upload size of 30 MB. In our experience, most short RNA-seq datasets are below this limit. Furthermore, we can process at most 3,000 block groups from an input BED file. If the number of block groups exceeds this limit, the user is given the option to analyze the 3,000 most highly expressed block groups. Alternatively, the user can split the input dataset by chromosome for analysis. However, users should seldom encounter this situation, since on average ~500-1,000 read profiles are processed from the BED file of a short RNA-seq experiment.
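Splitting a dataset by chromosome, as suggested above, can be done with a short script. This is a minimal sketch, assuming a BED file whose first column is the chromosome name; the function names and output file naming are illustrative, and interpreting the 30 MB limit as 30 × 1024 × 1024 bytes is an assumption.

```python
import os

# Upload limit stated in the FAQ; the exact byte count is an assumption.
MAX_UPLOAD_BYTES = 30 * 1024 * 1024


def exceeds_upload_limit(path):
    """Return True if the file is larger than the assumed 30 MB limit."""
    return os.path.getsize(path) > MAX_UPLOAD_BYTES


def split_bed_by_chromosome(bed_path, out_dir):
    """Write one BED file per chromosome and return {chromosome: output path}."""
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    paths = {}
    try:
        with open(bed_path) as bed:
            for line in bed:
                # Skip blank lines and track/browser/comment header lines.
                if not line.strip() or line.startswith(("track", "browser", "#")):
                    continue
                chrom = line.split(None, 1)[0]  # first column is the chromosome
                if chrom not in handles:
                    paths[chrom] = os.path.join(out_dir, f"{chrom}.bed")
                    handles[chrom] = open(paths[chrom], "w")
                handles[chrom].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return paths
```

Each resulting file can then be checked with `exceeds_upload_limit` and submitted separately.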
Q. What is the average computation time for analyzing an input dataset?
A. Users can expect results in around 10-60 minutes, depending on the size of the input dataset. We encourage users to provide an e-mail address when submitting a large dataset.

 Citation

 If you use mirDBA in your research, please cite:
  • Pundhir S and Gorodkin J (2013). MicroRNA discovery by similarity search to a database of RNA-seq profiles. Front. Genet. 4:133. doi: 10.3389/fgene.2013.00133
  • Langenberger D*, Pundhir S*, Ekstrøm CT, Stadler PF, Hoffmann S and Gorodkin J (2012). deepBlockAlign: a tool for aligning RNA-seq profiles of read block patterns. Bioinformatics 28(1):17-24 [PMID: 22053076] (*joint first authors)