First, please keep the programs within your research group and refer
people outside your group to me if they want predictions made. Also,
please don't adjust the grammar files for other research without asking
me first.

Here we go. I have made a fasta file `example.fasta' and put it
in the example directory. All the following commands are done from
there.

First we need to change the format of the file to the so-called `col'
format, which I use:

  ../bin/fasta2col example.fasta | sed 's/arbitrary/RNA/g' > example.col

`fasta2col' is an awk script, so you may have to adjust the first line
of it to make it work. If there are any problems, let me know. The col
file is a text file which should be more or less self-explanatory if
you want to look at it. It is not very compact, however.
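Since the prediction is made on an alignment, it may be worth checking
before conversion that every sequence in the fasta file has the same
length. The following is only a sketch and not part of the package:
`check_aligned' is a made-up helper name, and it assumes that records
start with `>' and that a sequence may span several lines.

```shell
# check_aligned FILE: succeed only if every fasta record in FILE has the
# same length as the first one. Offending records are listed on stdout.
# (Hypothetical helper, not one of the programs in ../bin.)
check_aligned() {
  awk '
    # Compare the record just read against the first record.
    function finish() {
      if (!have) return
      count++
      if (count == 1) want = n
      else if (n != want) {
        printf "%s: length %d, expected %d\n", name, n, want
        bad = 1
      }
    }
    /^>/ { finish(); name = substr($0, 2); n = 0; have = 1; next }
         { gsub(/[ \t\r]/, ""); n += length($0) }
    END  { finish(); exit (bad ? 1 : 0) }
  ' "$1"
}
```

It could be used like this:

  check_aligned example.fasta && echo "all sequences have the same length"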

Then we need to find the phylogeny of the sequences using the
neighbour joining approach:

  ../bin/findphyl ../bin/scfg.rate example.col > example_nj.col

The tree is now the first entry in the new col file. All the sequences
are still there. `scfg.rate' is an evolutionary rate matrix that is
made from RNA data from Knudsen and Hein (1999).

We can then estimate the branch lengths by maximum likelihood:

  ../bin/mltree ../bin/scfg.rate example_nj.col > example_ml.col

If you want to, you can draw the tree (as postscript):

  ../bin/drawphyl --mult=5 example_ml.col > tree.ps

Now, we are ready to do the analysis:

  ../bin/scfg --treeinfile ../bin/article.grm example_ml.col > res.col

The file `../bin/article.grm' has the grammar and evolutionary model
that is used for the analysis. Don't worry about the stderr output.

The result is now in `res.col'. Every sequence is annotated with the
same common structure. The reliability of the prediction is given for
each position (as `certainty'). Positions with more than 25% gaps are
not predicted and have reliability 0. The structure is in the
`align_bp' column, which gives, for each position, the alignment
position of its pairing partner.
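To see in advance which alignment columns are likely to fall under the
25% gap rule, something like the sketch below could be run on the
original fasta file. It is not part of the package: `gappy_columns' is
a made-up name, it assumes gaps are written as `-' or `.', and it may
not count gaps in exactly the same way as `scfg' does.

```shell
# gappy_columns FILE [THRESHOLD]: print the 1-based alignment columns in
# which the fraction of gap characters is greater than THRESHOLD
# (default 0.25). Hypothetical helper; gap characters are assumed to be
# '-' or '.', and all records are assumed to have the same length.
gappy_columns() {
  awk -v thr="${2:-0.25}" '
    # Count the gaps of the sequence collected so far, column by column.
    function finish(   i) {
      if (seq == "") return
      nseq++
      if (length(seq) > width) width = length(seq)
      for (i = 1; i <= length(seq); i++)
        if (substr(seq, i, 1) ~ /[-.]/) gaps[i]++
      seq = ""
    }
    /^>/ { finish(); next }
         { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END  {
      finish()
      for (i = 1; i <= width; i++)
        if (nseq > 0 && gaps[i] / nseq > thr) print i
    }
  ' "$1"
}
```

For example:

  gappy_columns example.fasta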

If you want a dotplot, you can use:

  ../bin/scfg --treeinfile --ppfile example.pp ../bin/article.grm example_ml.col > res.col

Now the data for the dotplot is in `example.pp'. It can be drawn (as
postscript) using:

  ../bin/drawdot example.pp > dotplot.ps

There is a directory called `my_example' where you can try all this.

If you want to use your own Newick tree, you can use:

  ../bin/newick2col my_own_tree.newick > my_own_tree.col

  ../bin/scfg --treefile my_own_tree.col ../bin/article.grm example.col > res.col

Or, as you should probably do, estimate its branch lengths by maximum
likelihood first:

  ../bin/newick2col my_own_tree.newick > my_own_tree.col

  cat example.col | ../bin/nohead >> my_own_tree.col

  ../bin/mltree ../bin/scfg.rate my_own_tree.col > my_own_tree_ml.col

  ../bin/scfg --treefile --ppfile my_own_tree_ml.col example.pp ../bin/article.grm example.col > res.col

Notice that the options (--treefile and --ppfile) come first, followed
by the files they refer to in the same order (my_own_tree_ml.col and
example.pp). Actually, the same could be done using:

  ../bin/scfg --treeinfile --ppfile example.pp ../bin/article.grm my_own_tree_ml.col > res.col

This is because we put the sequences into the file for the maximum
likelihood analysis.

We can visualize the results with parentheses using:

  ../bin/addparen res.col | ../bin/col2fasta > res.fasta

  ../bin/fasta2col res.fasta | ../bin/col2txtalign --space | less

If you want individual structure assignments, you can extend stems and
remove non-standard pairs using:

  ../bin/stdpair res.col | ../bin/extendstem | ../bin/addparen | ../bin/col2fasta > res.fasta

You can also make a postscript version of the alignment:

  cat res.fasta | ../bin/fasta2col | ../bin/col2psalign -pl --space > res.ps

In short, if you have a fasta file and want a structure prediction
with no hassles, you can write:

  ../bin/fasta2col example.fasta | sed 's/arbitrary/RNA/g' | ../bin/findphyl ../bin/scfg.rate | ../bin/mltree ../bin/scfg.rate | ../bin/scfg --treeinfile ../bin/article.grm | ../bin/addparen | ../bin/col2fasta > res.fasta
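If a long pipe fails, it can be hard to tell which step went wrong. As
an alternative, the same steps could be wrapped in a small script that
keeps the intermediate files. This is only a sketch: `PFOLD_BIN',
`missing_tools' and `predict' are made-up names, and the output file
names are my own choice here, not something the programs require.

```shell
# Assumed layout: all programs, scfg.rate and article.grm are in one
# directory (../bin unless PFOLD_BIN is set).
PFOLD_BIN="${PFOLD_BIN:-../bin}"

# missing_tools DIR: print the name of each program used below that is
# not executable in DIR. Prints nothing if all of them are present.
missing_tools() {
  for tool in fasta2col findphyl mltree scfg addparen col2fasta; do
    [ -x "$1/$tool" ] || echo "$tool"
  done
}

# predict FASTA PREFIX: run the steps one at a time, writing PREFIX.col,
# PREFIX_nj.col, PREFIX_ml.col, PREFIX_res.col and PREFIX_res.fasta.
# The chain stops at the first step that returns a non-zero status
# (for the first line that is the status of sed, not of fasta2col).
predict() {
  missing=$(missing_tools "$PFOLD_BIN")
  if [ -n "$missing" ]; then
    echo "not found in $PFOLD_BIN:" $missing >&2
    return 1
  fi
  "$PFOLD_BIN/fasta2col" "$1" | sed 's/arbitrary/RNA/g' > "${2}.col" &&
  "$PFOLD_BIN/findphyl" "$PFOLD_BIN/scfg.rate" "${2}.col" > "${2}_nj.col" &&
  "$PFOLD_BIN/mltree" "$PFOLD_BIN/scfg.rate" "${2}_nj.col" > "${2}_ml.col" &&
  "$PFOLD_BIN/scfg" --treeinfile "$PFOLD_BIN/article.grm" "${2}_ml.col" > "${2}_res.col" &&
  "$PFOLD_BIN/addparen" "${2}_res.col" | "$PFOLD_BIN/col2fasta" > "${2}_res.fasta"
}
```

From the example directory it could be called as:

  predict example.fasta example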

There are many different options for all these programs and there are
also other programs, so if you have any specific wishes, let me know.
Also let me know if there are any problems.





