
-------------------------------------------------------------------------------------------
[Programs and datasets]

    The directory 'deepBlockAlign_v*' contains the following programs and datasets:

    1. deepBlockAlign.x: program to align two or more RNA-seq read block profiles
    2. deepBlockAlign.c: source code of deepBlockAlign
    3. plotdeepBlockAlign.pl: wrapper program for deepBlockAlign that also outputs the alignments in graphical format
    4. plotblockAlign.r: accessory R program used by plotdeepBlockAlign.pl
    5. test_suite/human_eb.blockGroups: benchmark dataset used in the study

-------------------------------------------------------------------------------------------
[Installation]

    To install deepBlockAlign, download deepBlockAlign.tar.gz and unpack it:

    tar -zxvf deepBlockAlign.tar.gz

    This creates a directory named deepBlockAlign. Inside that directory, run

    make

    (or, equivalently, make all), which compiles deepBlockAlign and creates the executable

    deepBlockAlign.x

-------------------------------------------------------------------------------------------
[Usage]

    deepBlockAlign.x is called with the following parameters:

    deepBlockAlign.x -q <query file> -s <subject file> [ARGUMENTS]

    plotdeepBlockAlign.pl is called with the following parameters:

    plotdeepBlockAlign.pl -i <query file> -j <subject file>
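
    When many pairs of files have to be compared, the call to deepBlockAlign.x can be scripted. The following is a minimal Python sketch, not part of the distribution. Only the -q and -s options are taken from the usage line above; the function names, the default executable path, and the assumption that results are written to standard output are illustrative.

```python
import subprocess


def build_command(query, subject, executable="./deepBlockAlign.x"):
    """Return the argument list for one deepBlockAlign run.

    Only the -q (query) and -s (subject) options from the usage line are used.
    """
    return [executable, "-q", query, "-s", subject]


def run_alignment(query, subject, executable="./deepBlockAlign.x"):
    """Run deepBlockAlign and return whatever it prints to standard output.

    Assumption: the alignment result is written to standard output.
    Raises subprocess.CalledProcessError if the program exits with an error.
    """
    result = subprocess.run(
        build_command(query, subject, executable),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

    Passing the arguments as a list avoids shell quoting problems with unusual file names.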

-------------------------------------------------------------------------------------------
[Example]

    A usage example of deepBlockAlign is shown below. The example block groups shown in the manuscript are provided with the download, along with the complete benchmark dataset.

    $./deepBlockAlign.x -q testsuite/snoRNA-HACA-E3.example -s testsuite/hsa-mir-9-1.example
    # deepBlockAlign.x v1.3 started at Thu Dec  4 16:25:12 2014
    # query: testsuite/snoRNA-HACA-E3.example
    # subject: testsuite/hsa-mir-9-1.example
    # parameters (block group alignment)
    #  distance weight: 1.0; block similarity weight: 1.0; gap penalty: -1.0
    # parameters (block alignment)
    #  match: 1; mismatch: -1; threshold: 1; shape difference penalty: 1
    #  gap penalty (initialization): -2; gap penalty (extension): -1
    >cluster_929|E3|snoRNA_HACA|.|chr3:187987781-187987842(+)   >cluster_30|hsa-mir-9-1|miRNA|.|chr1:154656768-154656831(-) 0.554014
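
    The output above can be post-processed with a short script. The sketch below is a hypothetical helper, not part of deepBlockAlign. It assumes that lines starting with '#' carry run information only, and that each result line holds three whitespace-separated columns (query identifier, subject identifier, score), as in the example.

```python
def parse_alignment_output(text):
    """Return a list of (query_id, subject_id, score) tuples.

    Assumptions: '#' lines are informational, and each result line has
    exactly three whitespace-separated columns with the score last.
    """
    results = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' run-information lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Ignore anything that does not look like a result line.
        if len(fields) != 3 or not fields[0].startswith(">"):
            continue
        query_id, subject_id, score = fields
        results.append((query_id.lstrip(">"), subject_id.lstrip(">"), float(score)))
    return results
```

    Lines that do not match the expected layout are skipped rather than reported, so the result should be checked against the raw output when the format is in doubt.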

    A usage example of plotdeepBlockAlign.pl is shown below.

    $./plotdeepBlockAlign.pl -i testsuite/human_eb.blockGroups -j testsuite/human_eb.blockGroups -f cluster_176 -s cluster_531 -p .
    # plotdeepBlockAlign started at Thu, 04 Dec 2014 16:27:37 +0100
    # first block group: cluster_176
    # first file: testsuite/human_eb.blockGroups
    # second block group: cluster_531
    # second file: testsuite/human_eb.blockGroups
    computing alignment....done (score: 0.72)
    aligning blocks...done (cluster_176_n-a_1--cluster_531_hsa-mir-424_1.pdf,cluster_176_n-a_2--cluster_531_hsa-mir-424_2.pdf)
    aligning block groups...done (cluster_176_n-a--cluster_531_hsa-mir-424.pdf)
    creating final output file...
    done (./cluster_176_n-a--cluster_531_hsa-mir-424_FINAL.pdf)

-------------------------------------------------------------------------------------------
[Input Data Format]

    deepBlockAlign takes its input in the blockbuster output format. blockbuster is a program that defines block groups based on the mapping pattern of short reads. A typical input comprises:

    a) Header section (11 tab-delimited fields)
        >cluster_929   chr3   187987781   187987842   +   306.50   194   2   E3   snoRNA   Intron

    b) Reads section (7 tab-delimited fields)
        chr3   187987783   187987809   morin_EB_685629   0.500000   +   1
        chr3   187987783   187987806   morin_EB_685761   0.500000   +   1
        .
        .
        .
        chr3   187987783   187987806   morin_EB_685630   0.500000   +   1

    Important note: The header section produced by blockbuster consists of eight fields (id, chr, start, end, strand, #reads, #tags, #blocks). Before it is provided as input to deepBlockAlign, three additional fields (name, annotation, locus) need to be added. The information in these fields is required for most downstream analyses. If it is not available, put a "." in place of each of these fields.

    Header section as produced by blockbuster (8 tab-delimited fields):
        >cluster_929   chr3   187987781   187987842   +   306.50   194   2

    All example files provided with the download are already in the 11-field tab-delimited format.
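
    The conversion of 8-field blockbuster headers to the 11-field format can be automated. The following Python sketch is an illustration only and is not shipped with deepBlockAlign. The function names are hypothetical, and it assumes that header lines start with '>' and that all fields are separated by single tab characters. Missing fields are filled with "." unless other values are supplied.

```python
def pad_header(line, name=".", annotation=".", locus="."):
    """Append name, annotation and locus to an 8-field header line.

    Lines that are not 8-field headers are returned unchanged
    (apart from the trailing newline, which is removed).
    """
    fields = line.rstrip("\n").split("\t")
    if line.startswith(">") and len(fields) == 8:
        fields += [name, annotation, locus]
    return "\t".join(fields)


def pad_block_groups(lines):
    """Pad every header in a sequence of blockbuster output lines.

    Read lines (those not starting with '>') are passed through as-is.
    """
    return [pad_header(line) for line in lines]
```

    A file could be converted by reading it line by line, passing the lines to pad_block_groups, and writing the result joined with newlines. Headers that already contain 11 fields are left untouched, so running the conversion twice is harmless.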

-------------------------------------------------------------------------------------------
[Details]

    For details, please refer to

    1. http://rth.dk/resources/dba/download.php
    2. http://bioinformatics.oxfordjournals.org/content/28/1/17.long (deepBlockAlign publication)
    3. http://bioinformatics.oxfordjournals.org/content/25/18/2298.full (blockbuster publication)

-------------------------------------------------------------------------------------------
[Contact]

    Sachin Pundhir: sachin@rth.dk
    David Langenberger: david@bioinfo.uni-leipzig.de

-------------------------------------------------------------------------------------------
[Citation]

Langenberger D*, Pundhir S*, Ekstrøm CT, Stadler PF, Hoffmann S and Gorodkin J. (2012) deepBlockAlign: a tool for aligning RNA-seq profiles of read block patterns. Bioinformatics. 28(1):17-24 [PMID: 22053076] (*joint first authors)
