###########################################################################
# findgoodregion-0.2                                                      #
# #######################                                                 #
#                                                                         #
# A heuristic for identifying useful flanking regions from RNAcop output  #
#                                                                         #
#    Copyright (C) 2015  Nikolai Hecker                                   #
#                                                                         #
#   This program is free software: you can redistribute it and/or modify  #
#   it under the terms of the GNU General Public License as published by  #
#   the Free Software Foundation, either version 3 of the License, or     #
#   (at your option) any later version.                                   #
#                                                                         #
#   This program is distributed in the hope that it will be useful,       #
#   but WITHOUT ANY WARRANTY; without even the implied warranty of        #
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         #
#   GNU General Public License for more details.                          #
#                                                                         #
#   You should have received a copy of the GNU General Public License     #
#   along with this program.  If not, see <http://www.gnu.org/licenses/>. #
#                                                                         #
###########################################################################

The 'findgoodregion' package contains the tool 'findgoodregion' for identifying
suitable flanking regions based on the output of 'rank_flanks.pl', which is part
of the RNAcop utility package. For test purposes this package contains another
program, 'findplateau', which aims to identify plateau-like areas.
'findgoodregion' uses a heuristic based on a disjoint-set structure to identify
square areas containing values whose difference in log10-probability
(log fold-change) to the maximum probability of any flanking region combination
in the specified window is equal to or less than a specified threshold.
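
As a rough illustration of the idea, the following Python sketch groups
neighbouring matrix cells whose log10-probability lies within a threshold of
the maximum, using a disjoint-set (union-find) structure. It is not the code
used by 'findgoodregion'; the function names, the 4-neighbour adjacency rule
and the toy matrix are assumptions made for illustration only, and the
subsequent search for square areas is not shown.

```python
def find(parent, x):
    """Return the representative of x, shortening the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets containing a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def good_cell_sets(logp, threshold):
    """Group adjacent cells that lie within `threshold` of the maximum.

    logp: list of rows of log10-probabilities (hypothetical toy input).
    Adjacency is ASSUMED to be 4-neighbour (up/down/left/right).
    """
    best = max(max(row) for row in logp)
    good = {(i, j)
            for i, row in enumerate(logp)
            for j, value in enumerate(row)
            if best - value <= threshold}
    parent = {cell: cell for cell in good}
    for (i, j) in good:
        # Checking the lower and right neighbour covers every adjacent pair once.
        for neighbour in ((i + 1, j), (i, j + 1)):
            if neighbour in good:
                union(parent, (i, j), neighbour)
    sets = {}
    for cell in good:
        sets.setdefault(find(parent, cell), []).append(cell)
    return sorted(sorted(cells) for cells in sets.values())


# Toy matrix of log10-probabilities (made-up values).
toy = [
    [-0.1, -0.3, -2.0],
    [-0.2, -0.4, -2.5],
    [-3.0, -3.0, -0.5],
]
for cells in good_cell_sets(toy, threshold=0.5):
    print(cells)
# [(0, 0), (0, 1), (1, 0), (1, 1)]
# [(2, 2)]
```

In this toy case the cell (2, 2) passes the threshold but only touches the
other good cells diagonally, so under the assumed adjacency rule it forms its
own set.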


PREREQUISITES
#############
* Linux operating system or similar
* a working version of GCC (GCC version 4.6.3 or higher recommended)
* GNU C library: getopt
* GNU scientific library (GSL)


INSTALLATION
############
Simply change into the 'findgoodregion-0.2' parent folder and type 'make':

$ make

If you want to use 'findgoodregion' globally, add the
'findgoodregion-0.2' parent folder to your PATH environment variable.
For example, if you are using BASH, edit '~/.bashrc' and add the following line:

PATH=$PATH:/home/username/findgoodregion-0.2/;export PATH

if you have unpacked 'findgoodregion' in /home/username/.


RUNNING 'findgoodregion'
#########################
'findgoodregion' takes '*rddG.tsv' ('rank_flanks.pl' output) as input. The 
ddG values are converted into log10-probabilities.

USAGE:

i|input         Ranked flanking region ddG file: 'rank_flanks.pl' output file

o|output        Path for output TSV file

s|setcoverage   Based on the heuristic, disjoint sets are identified first.
		Next, the largest square areas for each disjoint set are
		calculated progressively. This value specifies the minimum
		fraction of values in each disjoint set that has to be
		covered by square areas.

t|threshold	Maximum log10 fold-change relative to the maximum log10
		probability within the user-specified window for considering
		flanking regions. We suggest using a threshold between 0.5
		and 1.0, which corresponds to a ~3-fold to 10-fold change
		relative to the maximum probability.

d|maxradius	To stop square areas from covering the entire matrix,
		this value limits how far each square area can be extended.

x|maxi		The maximum flanking region size allowed in 5' direction 

y|maxj		The maximum flanking region size allowed in 3' direction 

l|mini		The minimum flanking region size allowed in 5' direction 

r|minj		The minimum flanking region size allowed in 3' direction 
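
The suggested range for the threshold option follows from the definition of a
log10 fold-change: a threshold t corresponds to a linear fold-change of 10^t.
A minimal Python check (the function name is chosen for illustration only):

```python
def fold_change(log10_threshold):
    """Convert a log10 fold-change threshold into a linear fold-change."""
    return 10 ** log10_threshold


for t in (0.5, 1.0):
    print(f"threshold {t}: {fold_change(t):.2f}-fold")
# threshold 0.5: 3.16-fold
# threshold 1.0: 10.00-fold
```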

You might want to sort the 'findgoodregion' output by 'area' (the tolerance in the 5' and 3' directions).
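
Sorting by 'area' can be done with any tool that handles tab-separated files.
The following Python sketch assumes that the output TSV has a header line
containing a column named 'area'. This layout, as well as the other column
names and all values in the toy table, are assumptions and are not taken from
actual 'findgoodregion' output, so check your own file before adapting it.

```python
import csv
import io


def sort_by_area(tsv_text, column="area"):
    """Return the rows of a TSV table sorted by `column`, largest first."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    return sorted(rows, key=lambda row: float(row[column]), reverse=True)


# Hypothetical table: column names 'i', 'j' and all values are made up.
example = "i\tj\tarea\n10\t20\t4\n30\t5\t25\n15\t15\t9\n"
for row in sort_by_area(example):
    print(row["i"], row["j"], row["area"])
# 30 5 25
# 15 15 9
# 10 20 4
```

To use it on a real file, read the file contents into a string and pass the
actual name of the area column.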
 
For an example of how to run 'findgoodregion', have a look at the RNAcop-bundle or try the following example:

$ ./findgoodregion -i example/RF01754_AAKW01000063.1_rddG.tsv -o goodregion_RF01754_AAKW01000063.1.tsv


VERSION HISTORY
###############

* findgoodregion-0.2
- fixed no-output issue when only optimal solution is identified
- added example file
- changed default parameters

* findgoodregion-0.1: initial release
