RNA Structure Logo
The structure logo is an extension of the sequence logo by Schneider and
Stephens. We have extended the standard sequence logo to cope with any prior
nucleotide distribution, to allow for gaps in the alignments, and to indicate
the mutual information of basepaired positions in RNA. The logo is thus
composed of a sequence part and a structure/basepair part. The total height of
the sequence information part is computed as the relative entropy between the
observed fractions of a given symbol and the respective a priori
probabilities, with the constraint that the a priori ``probability'' of the
gap is always one. The a priori probabilities for the nucleotides sum to
one. Note that this might lead to negative ``information'' if sufficiently
many gaps are present at a given position. The height of each symbol can be
displayed in two ways: as a ``type 1 logo'', where the height is proportional
to its frequency, or as a ``type 2 logo'', where the height is proportional to
the ratio of the observed frequency to the expected (a priori) frequency. In
both cases, a symbol that appears less often than expected is displayed
upside down. The mutual information between the basepairs indicated by the
structural alignment is symmetric, so each involved position obtains only
half of the shared mutual information. The mutual information is
calculated as the relative entropy between the fraction of complementary
bases at the indicated basepaired positions in the alignment and the fraction
of basepairs one would expect by chance from the distribution of nucleotides
at the involved positions. A similar page for proteins has also been
constructed. See the paper below for further details.
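
The sequence part described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the server's actual script: the function names, the `-` gap character, the uniform example priors, and the normalisation used for the type 2 heights are all hypothetical choices made here.

```python
import math
from collections import Counter

GAP = "-"  # assumed gap character; the real input format may differ

# Hypothetical example priors (must sum to one over the nucleotides).
UNIFORM_PRIORS = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}


def observed_fractions(column):
    """Fraction of each symbol (gaps included) in one alignment column."""
    counts = Counter(column)
    return {sym: n / len(column) for sym, n in counts.items()}


def sequence_information(column, priors):
    """Relative entropy in bits between observed fractions and priors.

    The gap is given an a priori "probability" of one, so its term is
    never positive and can make the total negative.
    """
    info = 0.0
    for sym, q in observed_fractions(column).items():
        p = 1.0 if sym == GAP else priors[sym]
        info += q * math.log2(q / p)
    return info


def symbol_heights(column, priors, logo_type=1):
    """Return {symbol: (height, upside_down)} for the nucleotides present.

    Type 1: height proportional to the observed frequency.
    Type 2: height proportional to observed / expected frequency,
            normalised here so the heights sum to the column information
            (this normalisation is an assumption of the sketch).
    """
    fractions = observed_fractions(column)
    nucleotides = {s: q for s, q in fractions.items() if s != GAP}
    if not nucleotides:
        return {}
    info = sequence_information(column, priors)
    if logo_type == 1:
        weights = nucleotides
    else:
        ratios = {s: q / priors[s] for s, q in nucleotides.items()}
        norm = sum(ratios.values())
        weights = {s: r / norm for s, r in ratios.items()}
    # A symbol seen less often than expected is drawn upside down.
    return {s: (info * w, fractions[s] < priors[s]) for s, w in weights.items()}
```

For example, `sequence_information("A---", UNIFORM_PRIORS)` is negative, which matches the remark that many gaps at a position can lead to negative ``information''.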
You can get the scripts here or you can ``click in'' your alignment below.
However, the script has been written so that a position-wise background
distribution can be used. If you use the structure logos, please cite:
T. D. Schneider and R. M. Stephens. Sequence logos: a new way to
display consensus sequences. Nucleic Acids Research, Vol. 18, No. 20,
pp. 6097-6100, 1990. (Also check out Tom Schneider's page.)
Illustration:
The top logo is a plain sequence logo. The bottom logo includes the mutual
information for the respective basepair regions. The letter ``M'' indicates
the amount of mutual information for the corresponding basepairing at that
position. (Data are from Tuerk et al., PNAS 89, pp. 6988-6992, 1992.)
You can also ``click in'' your structural alignment below. The final logo in
PostScript can then be downloaded. You can see an example of the data format
here. There are some options you should set; the values shown are the default
values. You are welcome to send your comments or bug reports to
webmaster@rth.dk. The a priori probabilities for nucleotides must be strictly
greater than zero and less than one. Entering one line of probabilities
results in the same background distribution being used throughout the
alignment. Alternatively, enter as many lines as there are positions in the
alignment, corresponding to a position-wise background distribution of
nucleotides.
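
The two ways of entering the background distribution can be illustrated with a small parser. This is a sketch only: the function name `parse_background`, the assumed layout of four whitespace-separated numbers per line in the order A, C, G, U, and the sum-to-one tolerance are hypothetical and are not taken from the server's actual input format.

```python
def parse_background(text, n_positions, alphabet="ACGU"):
    """Parse a priori nucleotide probabilities into one dict per position.

    Assumed (hypothetical) layout: each non-empty line holds one number
    per nucleotide, in the order given by `alphabet`.
    One line        -> same background at every alignment position.
    n_positions lines -> position-wise background.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) not in (1, n_positions):
        raise ValueError("expected 1 line or one line per alignment position")

    parsed = []
    for row in rows:
        probs = [float(x) for x in row]
        if len(probs) != len(alphabet):
            raise ValueError("expected one probability per nucleotide")
        # Probabilities must be strictly between zero and one.
        if any(not 0.0 < p < 1.0 for p in probs):
            raise ValueError("probabilities must lie strictly between 0 and 1")
        # The nucleotide priors sum to one (tolerance is an assumption).
        if abs(sum(probs) - 1.0) > 1e-6:
            raise ValueError("probabilities must sum to one")
        parsed.append(dict(zip(alphabet, probs)))

    if len(parsed) == 1:
        # Reuse the single distribution throughout the alignment.
        parsed = [dict(parsed[0]) for _ in range(n_positions)]
    return parsed
```

Calling `parse_background("0.25 0.25 0.25 0.25", 3)` would give three identical distributions, while passing three lines would give one distribution per position.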
The basepair ``probabilities'' may in principle take any value; however, a
natural interpretation of the weighted base complementarity is as
probabilities, which also control the relative scaling between the sequence
information and the mutual structure information. In the version here we only
allow basepair probabilities for AU (UA), CG (GC), and GU (UG).
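
As a rough illustration of how such basepair weights could enter the mutual information, the sketch below treats it as a two-outcome relative entropy between the observed (weighted) fraction of complementary pairs and the fraction expected by chance from the nucleotide frequencies at the two positions. This is one reading of the description on this page, not necessarily the formula used by the script; the function names and the default weights of 1.0 are hypothetical, and the sketch assumes all weights lie between zero and one.

```python
import math

# Hypothetical default weights for the allowed pairs (order matters: i, j).
DEFAULT_PAIR_WEIGHTS = {
    ("A", "U"): 1.0, ("U", "A"): 1.0,
    ("C", "G"): 1.0, ("G", "C"): 1.0,
    ("G", "U"): 1.0, ("U", "G"): 1.0,
}


def _binary_relative_entropy(q, p):
    """Relative entropy in bits between (q, 1 - q) and (p, 1 - p)."""
    total = 0.0
    for obs, exp in ((q, p), (1.0 - q, 1.0 - p)):
        if obs > 0.0:  # terms with zero observed fraction contribute nothing
            total += obs * math.log2(obs / exp)
    return total


def basepair_mutual_information(col_i, col_j, pair_weights=DEFAULT_PAIR_WEIGHTS):
    """Mutual information (bits) for two columns marked as basepaired.

    Assumes weights in [0, 1]; symbols without a weight (including gaps)
    count as non-complementary.
    """
    n = len(col_i)
    # Observed weighted fraction of complementary pairs, sequence by sequence.
    observed = sum(pair_weights.get(pair, 0.0) for pair in zip(col_i, col_j)) / n
    # Fraction expected by chance from the per-position symbol frequencies.
    freq_i = {s: col_i.count(s) / n for s in set(col_i)}
    freq_j = {s: col_j.count(s) / n for s in set(col_j)}
    expected = sum(
        freq_i[a] * freq_j[b] * pair_weights.get((a, b), 0.0)
        for a in freq_i
        for b in freq_j
    )
    return _binary_relative_entropy(observed, expected)


def per_position_share(col_i, col_j, pair_weights=DEFAULT_PAIR_WEIGHTS):
    """Each of the two positions is assigned half of the shared value."""
    return basepair_mutual_information(col_i, col_j, pair_weights) / 2.0
```

In this sketch a perfectly conserved pair such as `"AAAA"` against `"UUUU"` yields zero, since every pairing is already expected from the nucleotide frequencies, whereas covarying columns such as `"GGCC"` against `"CCGG"` yield a positive value. Lowering the weight for GU (UG) relative to the other pairs would change how strongly wobble pairs contribute.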