Each fam<#> directory contains the following files:

-<rna>.<#>.mfa: a set of windows of length 120 taken from a short simulated genomic sequence, which is created by flanking the original RNA gene <rna>.<#> (from training.positive.stk) on both ends with dinucleotide-shuffled versions of that gene

-<rna>_shuffled.<#>.mfa: a set of windows of length 120 taken from a short simulated genomic sequence, which is created by flanking the shuffled RNA gene <rna>_shuffled.<#> (from training.negative.stk) on both ends with dinucleotide-shuffled versions of that gene

-genome_shuffled.mfa: a set of windows of length 120 taken randomly from human chromosome 22 (GRCh38)

-<rna>.shuffles.gz: has the format "id1 id2 similarity identity shuffled_seq1 shuffled_seq2". The sequences are created by aligning sequences from the positive set and shuffling them while preserving the dinucleotide content. This file is used to investigate how the score of the shuffled sequences depends on sequence identity
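
As an illustration of the <rna>.shuffles.gz format described above, the following is a minimal Python sketch of a reader. It is not part of the dataset tooling. It assumes that each non-empty line holds one record, that the six fields are separated by whitespace, and that similarity and identity are numeric; none of these details is stated above, so adjust the parsing if the files differ. The names ShufflePair, parse_shuffle_line, and read_shuffles are hypothetical.

```python
import gzip
from typing import Iterator, NamedTuple


class ShufflePair(NamedTuple):
    """One record: id1 id2 similarity identity shuffled_seq1 shuffled_seq2."""
    id1: str
    id2: str
    similarity: float
    identity: float
    shuffled_seq1: str
    shuffled_seq2: str


def parse_shuffle_line(line: str) -> ShufflePair:
    """Parse a single record (assumption: whitespace-separated fields)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    id1, id2, similarity, identity, seq1, seq2 = fields
    # Assumption: similarity and identity are written as plain numbers.
    return ShufflePair(id1, id2, float(similarity), float(identity), seq1, seq2)


def read_shuffles(path: str) -> Iterator[ShufflePair]:
    """Yield records from a gzip-compressed shuffles file, skipping blank lines."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_shuffle_line(line)
```

Calling read_shuffles on the path of a <rna>.shuffles.gz file yields one ShufflePair per record, which can then be grouped by identity to examine how the score of the shuffled sequences varies.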

Other sets of windows of different lengths, which are used in the Supplementary Material, are available upon request.
