PigEST - ncRNAs


We constructed a pipeline, EST2ncRNA, to search the PigEST data for known and putative novel ncRNAs and cis-regulatory RNA elements. The pipeline uses sequence similarity to ncRNA databases (BLAST) and structure similarity to Rfam (RaveNnA), as well as multiple alignments (ClustalW) from which conserved putative novel RNA structures are predicted (RNAz).
EST2ncRNA was run on the 48,000 contigs and 73,000 singletons available from the PigEST resource. We identified known RNA structures (excluding tRNAs, Rfam identifier RF00005) in 137 contigs and single reads (conreads), and predicted high-confidence novel RNA structures in non-protein-coding regions of an additional 1,280 conreads. In summary, the porcine transcriptome comprises trans-acting elements (ncRNAs) in 715 contigs and 340 singletons as well as cis-acting elements (inside UTRs) in 311 contigs and 51 singletons.
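
The counts above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the trans-acting and cis-acting figures partition the same set of conreads as the known and novel figures; the text does not state this explicitly, and the variable names are illustrative only.

```python
# Counts quoted in the paragraph above.
known_conreads = 137     # conreads with known RNA structures (tRNAs excluded)
novel_conreads = 1280    # additional conreads with predicted novel structures

trans_acting = {"contigs": 715, "singletons": 340}  # ncRNAs
cis_acting = {"contigs": 311, "singletons": 51}     # elements inside UTRs

# Assumption: both breakdowns describe the same set of conreads.
total_by_evidence = known_conreads + novel_conreads
total_by_role = sum(trans_acting.values()) + sum(cis_acting.values())

print(total_by_evidence, total_by_role)
```

Under that reading, both breakdowns add up to the same number of conreads.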

The menu entry Known RNA families lists all known non-coding RNAs as well as cis-acting RNAs. They were found by primary sequence similarity (local blastn, E-value < 1e-20, identity > 95%, subject coverage > 85%) to RNAdb (Aug. 2004), Rfam (release 7.0, March 2005), miRBase (release 8.1, May 2006) and fantom3_noncoding (release 3.0), and by secondary structure similarity (RaveNnA) to Rfam 7.0 covariance models.
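
The blastn thresholds quoted above can be expressed as a simple hit filter. This is a minimal sketch, not the EST2ncRNA code: the `Hit` record, the seven-column tab-separated layout read by `parse_line`, and the definition of subject coverage as aligned subject span divided by subject length are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Thresholds quoted in the text above.
MAX_EVALUE = 1e-20
MIN_IDENTITY = 95.0          # percent identity
MIN_SUBJECT_COVERAGE = 85.0  # percent of the subject (database) sequence


@dataclass
class Hit:
    """One blastn hit (hypothetical record, not a BLAST data type)."""
    query_id: str    # PigEST conread
    subject_id: str  # database entry, e.g. from Rfam or miRBase
    pident: float    # percent identity of the alignment
    sstart: int      # alignment start on the subject (1-based)
    send: int        # alignment end on the subject (1-based)
    slen: int        # full length of the subject sequence
    evalue: float


def subject_coverage(hit: Hit) -> float:
    """Percent of the subject covered by the alignment (assumed definition)."""
    # abs() handles hits on the reverse strand, where sstart > send.
    aligned = abs(hit.send - hit.sstart) + 1
    return 100.0 * aligned / hit.slen


def passes_filter(hit: Hit) -> bool:
    """Apply the three strict thresholds from the text."""
    return (
        hit.evalue < MAX_EVALUE
        and hit.pident > MIN_IDENTITY
        and subject_coverage(hit) > MIN_SUBJECT_COVERAGE
    )


def parse_line(line: str) -> Hit:
    """Parse one tab-separated line in the assumed column order:
    query, subject, pident, sstart, send, slen, evalue."""
    q, s, pident, sstart, send, slen, evalue = line.rstrip("\n").split("\t")
    return Hit(q, s, float(pident), int(sstart), int(send), int(slen), float(evalue))


if __name__ == "__main__":
    # Made-up example hits, for illustration only.
    lines = [
        "conread_A\tdb_entry_1\t98.0\t1\t90\t100\t1e-30",  # 90% coverage
        "conread_B\tdb_entry_2\t98.0\t1\t80\t100\t1e-30",  # 80% coverage
    ]
    for line in lines:
        hit = parse_line(line)
        print(hit.query_id, passes_filter(hit))
```

All three comparisons are strict, matching the "less than" and ">" wording of the thresholds.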

Top Candidates lists the high-confidence candidate ncRNAs, microRNAs and cis-acting RNAs expressed in the PigEST data.

Database Search lets you look up a specific PigEST conread. Candidates that pass slightly more relaxed criteria are also shown here.

Download offers additional files for the project.

The complete PigEST vs. cow genome map, built as the initial step of the ncRNA predictions to find sequences homologous to the pig conreads, is available at https://rth.dk/resources/pigest/more/genomemap.

Details can be found in Seemann et al., "Detection of RNA structures in porcine EST data and related mammals" (submitted).