#------------------------------------------------------------------------------#
#	Copyright				May,    	1998	       #
#	Burkhard Rost		rost@LION-ag.de 			       #
#	Wilckensstr. 15		http://www.embl-heidelberg.de/~rost/	       #
#	D-69120 Heidelberg						       #
#------------------------------------------------------------------------------#

# ============================================================================ #
# PHD options, and their explanations
# ============================================================================ #
#
# 
# ------------------------------------------------------------------------------
# (0) Table of Contents
# ------------------------------------------------------------------------------
# 
# (1) How to choose parameters on the command line ?
# (2) Parameters any user may wish to use
# (3) Parameters experienced users may want to use
# (4) Shortcuts (abbreviations) for parameters
# (5) Parameters usually not of interest
# (6) Parameters mainly for setting up PHD
# (7) Parameters better to ignore...
# 
# 
# ------------------------------------------------------------------------------
# (1) How to choose parameters on the command line ?
# ------------------------------------------------------------------------------
# 
# You can set all options/parameters on the command line by:
#
#	'keyword=value'                    (e.g. 'phd file.hssp nresPerRow=100')
#
# Additionally, there are a couple of options for which particular keywords  are
# understood directly (do 'phd help' to find out more about those). For example:
# 
#       'phd file.hssp sec' 
# 
#			-> run only the secondary structure prediction
# Syntax of this file:
# 
# keyword		meaning
# 
# 
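# The command-line convention described in (1) can be sketched as follows.
# This is an illustration only, NOT the actual PHD parser: the function name
# 'parse_phd_args' is hypothetical, and it is assumed that every argument
# containing '=' is a 'keyword=value' pair and every other argument is a
# directly understood keyword (such as 'sec').

```python
def parse_phd_args(args):
    """Split arguments into keyword=value pairs and bare keywords.

    Hypothetical helper for illustration; not part of PHD.
    """
    params = {}
    flags = []
    for arg in args:
        if "=" in arg:
            # Split on the first '=' only, so that values may contain '='.
            key, value = arg.split("=", 1)
            params[key] = value
        else:
            flags.append(arg)
    return params, flags


# Arguments following the input file in 'phd file.hssp nresPerRow=100 sec'
params, flags = parse_phd_args(["nresPerRow=100", "sec"])
print(params, flags)
```

# Splitting on the first '=' only keeps values such as
# 'optFilterHssp=thresh=5' intact.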
# ------------------------------------------------------------------------------
# (2) Parameters any user may wish to use
# ------------------------------------------------------------------------------
# 
#			--------------------------------------------------------
#                       system stuff
#			--------------------------------------------------------
ARCH			specifies the CPU architecture, allowed: 
			      ALPHA|SGI64|SGI5|SUNMP
			NOTE: you can also provide this in your environment  by:
			      'setenv ARCH ALPHA'
# 
#			--------------------------------------------------------
#                       run options
#			--------------------------------------------------------
# 
acc			as any command line argument will run only PHDacc
htm			as any command line argument will run only PHDhtm
sec			as any command line argument will run only PHDsec
both			as any command line argument will run only PHDsec+PHDacc

optPhd			specifies which PHD program to run, allowed: 
			      3|sec|acc|htm|both=sec+acc
# 
# 
# ------------------------------------------------------------------------------
# (3) Parameters experienced users may want to use
# ------------------------------------------------------------------------------
# 
# 
#			--------------------------------------------------------
#                       PHD run options
#			--------------------------------------------------------
# 
optRdb                  =0 implies that no RDB file is written
optPara                 =0 implies that the  parameter  file is not passed as an
			      argument.
			NOTE: to specify parameter files see   'paraAcc|Sec|Htm'
optNice                 setting lower priority to jobs, syntax: optNice=nice-15
			NOTE: no blank, just one string
optPdbid                =PDBid selects a neural network that did NOT use the
			      specified protein for training.
			NOTE: This  corresponds to a cross-validated prediction.
optHtmisitMin		qualify as notHTM if best transmembrane helix  score  is
			      lower than this value.
			NOTE: expected rates of false positives (number of glo-
			      bular proteins predicted with membrane segments),
			      and expected rate of membrane proteins missed:
			0.7 -> false positives < 4%, missed < 1%
			0.8 -> false positives < 2%, missed < 3%
# 
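# The decision rule described for 'optHtmisitMin' can be sketched as follows.
# This is a minimal illustration, not the PHD implementation: the function
# name 'is_not_htm' is hypothetical, and the score is assumed to be a plain
# number on the same scale as the thresholds quoted above (0.7, 0.8).

```python
def is_not_htm(best_helix_score, opt_htmisit_min):
    """Return True if the protein qualifies as notHTM.

    A protein qualifies as notHTM when its best transmembrane helix
    score is lower than the threshold 'optHtmisitMin'.
    """
    return best_helix_score < opt_htmisit_min


# Hypothetical scores, checked against the threshold 0.7
print(is_not_htm(0.65, 0.7))
print(is_not_htm(0.75, 0.7))
```

# A score equal to the threshold does not qualify as notHTM, since the text
# above says "lower than this value".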
cross			as any command line argument will run PHD in cross-vali-
			      dation mode, i.e.,  such  that neural networks are
			      used that were NOT trained on the structural know-
			      ledge about your protein (for test purposes).
optIsCross		=1, use cross-validation analysis,  i.e.,  NOT  the most
			      accurate prediction method,  but the one that used
			      no information about the structure of your protein
			      during training.
optDoHtmfil             if 1, the HTM prediction is filtered by a 'rule-of-thumb'
			      method
			NOTE: also invoked by keywords 'notHtmfil' or 'doHtmfil'
optDoHtmisit            if 1, the best predicted HTM is checked and evaluated as
			      to the likelihood of your  protein  containing any
			      membrane segment, at all.
			NOTE: also invoked by keywords 'notHtmisit' or 'doHtmisit'
			NOTE2:for further details see 'optHtmisitMin'
optDoHtmref             if 1, PHDhtm is refined
optDoHtmtop             if 1, the membrane topology (PHDtopology) is predicted
# 
isTest			use the keyword 'test' or 'isTest=1'  for  fast tests of
			       PHD (the best prediction is NOT returned!)
# 
#                       --------------------------------------------------------
#                       massaging input file
#                       --------------------------------------------------------
# 
doFilterHssp		=1, filter the HSSP file for PHDacc by formula +5  (30%)
			NOTE: the value is specified by 'filterHsspVal'
			NOTE2:also invoked by simply using 'filter'
keepFilterHssp		as keyword, prevents the filtered HSSP file from being
			      deleted
optFilterHssp           provides the  directives of how to filter the input file
			      For example:
			'thresh=5 excl=5-10,15 red=70'
			      would filter with the HSSP-threshold +5 (i.e. 30%
			      pairwise sequence identity  for alignments longer
			      than 80 residues), exclude alignments 5-10 and 15
			      and exclude all pairs with more than 70% sequence
			      identity.
                        The following options are available  (see hssp_filter.pl
			for further explanations):
			thresh=x     alignments will be cut-off at  pairwise se-
				     quence identity < FORMULA + X %
			excl=n1-n2   sequences n1 to n2 will be excluded,  e.g.:
				     n1-*, n1-n2, or: n1,n5,... 
			incl=m1-m2   sequences m1 to m2 will be included,  e.g.:
				     m1-*, m1-m2, or: m1,m5,... 
			red=x        reduce redundancy   (purge all with ide<=x) 
			clean        delete mutually identical sequences 
			minIde=	     minimal levels for NEW identity    (ask BR)
			minSim=      minimal levels for NEW similarity  (ask BR)
			min=         minimal levels for both previous e (ask BR)
			maxIde=	     maximal levels for NEW identity    (ask BR)
			maxSim=      maximal levels for NEW similarity  (ask BR)
			max=         maximal levels for both previous e (ask BR)
			thresh=ide-5 threshold chosen by new curves (ide-5=>20%) 
			thresh=sim+10  ditto for similarity
			thresh=rule5 all with ide<5 (30%) will be included  only
				     if similarity > identity 
filterHsspVal		if the parameter  'doFilterHssp' is set to 1,  the input
			      alignment will be  filtered  such that no sequence
			      will be used for prediction for which the distance
			      to the HSSP threshold (see manual) is below
			      25 + filterHsspVal percent.
filterHsspMetric	metric used for converting MSF -> HSSP and for filtering
			NOTE: also used for  predicting structure given a sparse
			      alignment.
keepConvertHssp		as keyword, prevents the converted HSSP file from being
			      deleted
# 
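# The 'optFilterHssp' directive string (e.g. 'thresh=5 excl=5-10,15 red=70')
# can be taken apart as sketched below. This is an illustration only and is
# NOT how hssp_filter.pl parses its input: the function names are
# hypothetical, and it is assumed that options are separated by blanks and
# that 'excl'/'incl' hold comma-separated numbers or ranges.

```python
def expand_ranges(spec):
    """Expand '5-10,15' into [5, 6, 7, 8, 9, 10, 15].

    Open-ended ranges such as '5-*' are kept as text, because the
    number of alignments is not known here.
    """
    positions = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            if end == "*":
                positions.append(part)
            else:
                positions.extend(range(int(start), int(end) + 1))
        else:
            positions.append(int(part))
    return positions


def parse_filter_directive(directive):
    """Turn a directive string into a dictionary of options."""
    options = {}
    for token in directive.split():
        key, sep, value = token.partition("=")
        # Options without a value (e.g. 'clean') are stored as True.
        options[key] = value if sep else True
    for key in ("excl", "incl"):
        if key in options:
            options[key] = expand_ranges(options[key])
    return options


print(parse_filter_directive("thresh=5 excl=5-10,15 red=70"))
```

# Values such as 'ide-5' for 'thresh' are left as text, since only 'excl' and
# 'incl' are expanded into numbers.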
#                       --------------------------------------------------------
#                       PHD output options
#                       --------------------------------------------------------
# 
nresPerRow		number of residues per line in human readable  output of
			      PHD (.phd).
# 
doRetHeader             =1,   return text describing PHD accuracy
doRetDssp		=1,   return DSSP formatted prediction 
			NOTE: also invoked by keyword 'dssp'
doRetAli		=1,   return MSF formatted prediction 
			NOTE: also invoked by keyword 'msf'
doRetAliExpand          =1,   expand the  insertions  and  deletions  in the MSF 
			      formatted PHD prediction returned.
			NOTE: also invoked by keyword 'expand'
formatRetAli		format of output file merging alignment and prediction
			NOTE: default = SAF (also possible 'msf')
nresPerRowAli		number of residues per line in MSF|SAF output file 
# 
optDoEval               include the analysis of accuracy if DSSP is known, i.e.,
                              if your protein has a known structure.
			NOTE: currently working only for HSSP input files
optPhd2msf		how to convert PHD results to MSF format.      Possible:
  		        'optPhd2msf=0|1|expand'
optHssp2msf		converting the  input  HSSP file to  MSF.   For example:
			'ARCH=SGI64 exe=convert_seq.SGI5 expand'
			NOTE: add "expand " to expand insertions for HSSP -> MSF
# 
doRetDssp		convert PHD.rdb into DSSP formatted  file to be used for
			      the prediction based threading TOPITS
# 
#			--------------------------------------------------------
#                       output file names
#			--------------------------------------------------------
# 
title			output files will be named $dirWork.$title.$ext, see ext
jobid			process id used as label for intermediate files
chain			protein chain, used for running PHD only on one chain in
			      an HSSP file
# 
fileOutPhd		name of file with final human readable PHD output
fileOutRdb		name of file with final PHD output in RDB format
fileOutRdbHtm		name of file with final PHDhtm output in RDB format
			NOTE: for  'optPhd=htm'  this file will be  forced to be
			      identical to 'fileOutRdb'
fileNotHtm		name of file  flagging  that no membrane helix was found
			      given the  threshold defined  by  'optHtmisitMin'.
fileOutAli		name of file with PHD result in MSF or SAF format
			NOTE: only if 'doRetAli=1' and
fileOutDssp		name of file with PHD result in DSSP format

filePhd			short for fileOutPhd
fileRdb			short for fileOutRdb
fileDssp		short for fileOutDssp
fileAli			short for fileOutAli

# 
#			--------------------------------------------------------
#                       output file names: job management
#			--------------------------------------------------------
# 
fileOutScreen		file dumping output from sys call
fileOutTrace		LOG file tracing some of the PHD output
titleTmp		title for temporary files (will be named 'titleTmp-xyz')
# 
#                       --------------------------------------------------------
#                       general job management
#                       --------------------------------------------------------
# 
verbose			=1 will write some details onto screen
verb2			=1 will write even more details onto the screen
verb3			=1 will write still more details onto the screen
# 
debug                   =1 keeps most intermediate files, i.e., they will not
			      be deleted automatically.
silent			as  any  argument in the  command line will suppress all
			      output written onto the screen.
			NOTE: corresponds to verbose=0, verb2=0, verb3=0.
# 
# ------------------------------------------------------------------------------
# (4) Shortcuts (abbreviations) for parameters
# ------------------------------------------------------------------------------
# 
#			--------------------------------------------------------
#                       PHD run options
#			--------------------------------------------------------
# 
3			alias for 'optPhd=3'	(run PHDsec + PHDacc + PHDhtm)
acc			alias for 'optPhd=acc'
both			alias for 'optPhd=both' (run PHDsec + PHDacc)
htm			alias for 'optPhd=htm'
sec			alias for 'optPhd=sec'
#
skipOld			alias for 'doNew=0'     (i.e. no job if results exist,
						 already.   Note: for running
						 PHD on file lists!)
#
nice-n			alias for 'optNice=N'   (N=1..19, with 1= low priority)
nonice			alias for 'optNice=0'   (high priority for job)
debug			alias for 'debug=1'
# 
cross			alias for 'optIsCross=1'
pdbid=ID		alias for 'optPdbid=ID'
#
test			alias for 'isTest=1'
# 
notHtmisit		alias for 'optDoHtmisit=0'
notHtmfil		alias for 'optDoHtmfil=0'
notHtmref		alias for 'optDoHtmref=0'
notHtmtop		alias for 'optDoHtmtop=0'
noPhdHead		alias for 'doRetHeader=0'
#
filter			alias for 'doFilterHssp=1'
# 
#                       --------------------------------------------------------
#                       PHD output options
#                       --------------------------------------------------------
# 
dssp			alias for 'doRetDssp=1'
msf			alias for 'doRetAli=1 formatRetAli=msf'
saf			alias for 'doRetAli=1 formatRetAli=saf'
expand			alias for 'doRetAli=1 doRetAliExpand=1'
# 
rdb2pred		alias for 'doPrepeval=1 doRdb2pred=1'
keepConv		alias for 'keepConvertHssp=1'
keepFilter		alias for 'keepFilterHssp=1'
# 
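# How the shortcuts in (4) map onto full parameters can be sketched as below.
# This is an illustration, not the PHD implementation: the table 'ALIASES'
# copies only a few entries from the list above, and the function name
# 'expand_aliases' is hypothetical.

```python
# A few of the shortcuts listed in section (4), each mapped to the
# 'keyword=value' settings it stands for.
ALIASES = {
    "3": ["optPhd=3"],
    "acc": ["optPhd=acc"],
    "both": ["optPhd=both"],
    "htm": ["optPhd=htm"],
    "sec": ["optPhd=sec"],
    "cross": ["optIsCross=1"],
    "filter": ["doFilterHssp=1"],
    "dssp": ["doRetDssp=1"],
    "msf": ["doRetAli=1", "formatRetAli=msf"],
    "saf": ["doRetAli=1", "formatRetAli=saf"],
    "expand": ["doRetAli=1", "doRetAliExpand=1"],
}


def expand_aliases(args):
    """Replace each known shortcut by its full settings; keep the rest."""
    expanded = []
    for arg in args:
        expanded.extend(ALIASES.get(arg, [arg]))
    return expanded


print(expand_aliases(["file.hssp", "sec", "msf", "nresPerRow=100"]))
```

# Arguments that are not in the table (file names, 'keyword=value' pairs) are
# passed through unchanged.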
# ------------------------------------------------------------------------------
# (5) Parameters usually not of interest
# ------------------------------------------------------------------------------
# 
#                       --------------------------------------------------------
#                       parameter files: lists of cross-validation experiments
#                       --------------------------------------------------------
# 
fileListTrain		list of PDB proteins used for training
fileListTrainSecAcc	list of PDB proteins used for training PHDacc and PHDsec
fileListTrainHtm	list of SWISS-PROT proteins used for training PHDhtm
# 
#                       --------------------------------------------------------
#                       phd parameters (giving architecture names)
#                       --------------------------------------------------------
# 
paraAcc			file specifying  which  neural networks to use for final 
			      jury decision of PHDacc
paraSec			file specifying  which  neural networks to use for final 
			      jury decision of PHDsec
paraHtm			file specifying  which  neural networks to use for final 
			      jury decision of PHDhtm
#
#                       --------------------------------------------------------
#                       running PHD on a file list
#                       --------------------------------------------------------
# 
doNew			generate new PHD files even if old ones exist. If set to
			      = 0,  PHD is not run if an old result file exists,
			      already! 
			NOTE: this option is for running PHD on a list of files
# 
doPrepeval		EVALSEC/EVALACC are programs evaluating prediction acc-
			      uracy for proteins of known 3D structure.  If you
			      set 'doPrepeval=1'  when running PHD on a list of
			      proteins, a final  list of the result  RDB  files
			      will be written. 
			NOTE: see 'doPrepevalSort' AND 'doRdb2pred'      
doPrepevalSort		final list of all RDB files written will be sorted by 
			      the PDB identifier ignoring version numbers:
			      e.g. '1aa0,4aah,1bgl'
doRdb2pred		all results are appended in a long file readable by  the
			      programs EVALSEC/EVALACC for evaluating prediction
			      accuracy for proteins of known structure.
# 
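# The ordering described for 'doPrepevalSort' can be sketched as below. This
# is an illustration only, not the script behind 'exePdbidSort': the function
# name is hypothetical, and the "version number" is assumed to be the leading
# digit of the four-character PDB identifier.

```python
def sort_pdb_ids(pdb_ids):
    """Sort PDB identifiers, ignoring the leading version digit."""
    return sorted(pdb_ids, key=lambda pdb_id: pdb_id[1:].lower())


print(sort_pdb_ids(["1bgl", "4aah", "1aa0"]))
```

# With this key, '4aah' sorts between '1aa0' and '1bgl', which matches the
# example order given for 'doPrepevalSort'.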
#                       --------------------------------------------------------
#                       output options
#                       --------------------------------------------------------
# 
riSubSec                the human readable PHD output (.phd) adds a row with the
                              subset of more reliably predicted residues.  These
			      will be all that have a higher  reliability  index
			      than 'riSubSec'.
riSubAcc		the human readable PHD output (.phd) adds a row with the
			      subset of more reliably predicted residues.  These
			      will be all that have a higher  reliability  index 
			      than 'riSubAcc'.
riSubHtm		the human readable PHD output (.phd) adds a row with the
			      subset of more reliably predicted residues.  These
			      will be all that have a higher  reliability  index 
			      than 'riSubHtm'.
riSubSym		symbol used for presenting less reliably predicted resi-
			      dues in the row 'SUB sec|acc|htm'.
# 
#			--------------------------------------------------------
#                       general file directories
#			--------------------------------------------------------
# 
dirIn			directory of input files
dirOut			directory of output files
dirWork			working directory, for intermediate files
# 
#			--------------------------------------------------------
#			file extensions
#			--------------------------------------------------------
# 
extHssp			extension for input HSSP file
extDssp			extension for input DSSP file
extMsf			extension for input MSF file
extPhd			extension for human readable PHD results
extRdb			extension for RDB formatted PHD results
extPhdMsf		extension for PHD result in MSF format
extPhdDssp		extension for PHD result in DSSP format
extNotHtm		extension for file flagging that no transmembrane helix
			      was  found above the  'optHtmisitMin'  threshold.
extFasta		extension of FASTA input files
extPhdSaf		extension of output PHD.saf files
extRdbHtm		extension of output PHD.rdb_for_htm files
extSaf			extension of SAF input files
# 
# 
# ------------------------------------------------------------------------------
# (6) Parameters mainly for setting up PHD
# ------------------------------------------------------------------------------
# 
#			--------------------------------------------------------
#			PHD directories
#			--------------------------------------------------------
# 
dirHome			directory in which the subtree 'phd/' is expected
dirPhd			directory of PHD, ending with '/'
dirLib			directory of perl libraries, ending with '/'
dirPhdScr		directory of perl scripts, ending with '/'
dirPhdBin		directory of machine specific binaries, ending with '/'
dirPhdNet		directory of neural networks, ending with '/'
dirPhdNetCross		directory of nets for cross-validation, ending with '/'
dirPhdPara		directory of parameter files, ending with '/'
dirPhdParaCross		directory of parameter files for cross-validation
dirPhdMat		directory of various text files, ending with '/'
dirPhdPack		directory of perl packages, ending with '/'
# 
#                       --------------------------------------------------------
#                       Protein database directories
#                       --------------------------------------------------------
# 
dirData			directory for databases, in general, ending with '/'
dirHssp			directory of HSSP database, ending with '/'
dirDssp			directory of DSSP database, ending with '/'
dirSwiss		directory of SWISS-PROT db, ending with '/'
# 
#                       --------------------------------------------------------
#                       MaxHom material directories
#                       --------------------------------------------------------
# 
dirMax			directory of MaxHom stuff, ending with '/'
dirMaxMat		directory of MaxHom matrices, ending with '/'
# 
# 
#                       --------------------------------------------------------
#                       binaries (fortran)
#                       --------------------------------------------------------
# 
exeHsspFilter		FORTRAN executable for filtering HSSP input
exeConvertSeq		FORTRAN executable for converting format of input
exePhd			FORTRAN executable for PHD
# 
#                       --------------------------------------------------------
#                       perl scripts (PHD)
#                       --------------------------------------------------------
# 
exeHtmfil		PERL executable for running the PHDhtm filter
exeHtmref		PERL executable for running the PHDhtm refinement
exeHtmtop		PERL executable for running the PHDtopology prediction
exeHtmisit		PERL executable for distinguishing  membrane from globu-
			      lar proteins
# 
#                       --------------------------------------------------------
#                       perl scripts (tools)
#                       --------------------------------------------------------
# 
exeCopf			PERL executable  for  converting protein database formats
			      type  'copf help' to learn more about this program
exeConvHssp2saf		PERL executable for converting HSSP to SAF files
			      type  'conv_hssp2saf.pl help'  to learn more about
			      this  program
exeHsspFilterPl		PERL executable for filtering HSSP input
# 
# 
#exeHssp2msf		PERL executable for converting HSSP to MSF
exePhd2msf		PERL executable for converting PHD.rdb to MSF
exePhd2dssp		PERL executable for converting PHD.rdb to DSSP
exeRdb2pred		PERL executable for converting PHD.rdb to *.predrel form
exePdbidSort		PERL executable for sorting by PDB identifier
# 
#                       --------------------------------------------------------
#                       perl libraries
#                       --------------------------------------------------------
#
exeLibBr		PERL library
exeLibPhd		PERL library
exeLibUt		PERL library
# 
#                       --------------------------------------------------------
#                       file material: help stuff
#                       --------------------------------------------------------
# 
fileHelpMan		PHD manual
fileHelpOpt		this file
# 
#                       --------------------------------------------------------
#                       file material: headers (accuracy tables)
#                       --------------------------------------------------------
# 
headPhd3		header giving accuracy tables for  PHDsec, PHDacc+PHDhtm
headPhdBoth		header giving accuracy tables for PHDsec and PHDacc
headPhdConcise		short header giving accuracy tables for PHD
headPhdAcc		header giving accuracy tables for PHDacc
headPhdHtm		header giving accuracy tables for PHDhtm
headPhdSec		header giving accuracy tables for PHDsec
# 
#                       --------------------------------------------------------
#                       file material: abbreviations
#                       --------------------------------------------------------
# 
abbrPhd3		explanation of abbreviations for PHD
abbrPhdBoth		explanation of abbreviations for PHDsec and PHDacc
abbrPhdAcc		explanation of abbreviations for PHDacc
abbrPhdHtm		explanation of abbreviations for PHDhtm
abbrPhdSec		explanation of abbreviations for PHDsec
abbrPhdRdb		explanation of abbreviations for PHD RDB format
# 
#			--------------------------------------------------------
#			further parameters: system
#			--------------------------------------------------------
# 
USER			specifies the user name 
			NOTE: you can also provide this in your environment  by:
			      'setenv USER yourname'
			NOTE 2: usually this variable is set in UNIX
# 
# ------------------------------------------------------------------------------
# (7) Parameters better to ignore...
# ------------------------------------------------------------------------------
# 
optIsDec		for special use, only: should be one for DEC alpha
optKg			for special use, only
optMach			for special use, only
optUserPhd		for special use, only
# 
formatInput		set on the fly
optHtmfin		set on the fly
paraAccCross		set on the fly
paraHtmCross		set on the fly
paraSecCross		set on the fly
# 
#			--------------------------------------------------------

# ------------------------------------------------------------------------------



