jointsnvmix: joint_snv_mix.xml comparison

comparison joint_snv_mix.xml @ 0:a1034918ab9b draft

Uploaded

author	fcaramia
date	Thu, 20 Jun 2013 00:03:08 -0400
parents
children	26953f1c8af2


--1:000000000000
+0:a1034918ab9b
+<tool id="joint_snv_mix" name="Joint SNV Mix" version="0.7.5">
+<description>classify germline and somatic mutations</description>
+<requirements>
+	<requirement type="package" version="2.7">python</requirement>
+	<requirement type="package" version="0.19.1">cython</requirement>
+	<requirement type="package" version="0.5">pysam</requirement>
+	<requirement type="package" version="0.1.18">samtools</requirement>
+	<requirement type="package" version="0.7.5">jointsnvmix</requirement>
+</requirements>
+<command interpreter="perl">
+	joint_snv_mix.pl
+	"ACTION::${option.option}"
+	"REFGENOME::$refFile.fields.path"
+	"BAMNORMAL::$normal_file"
+	"BAMTUMOR::$tumor_file"
+	#if str($option.option) == "classify":
+		#if ($option.parameters):
+			"OPTION::--parameters_file $option.parameters"
+		#end if
+		"OPTION::--out_file $output"
+		"OPTION::--somatic_threshold $option.somatic_threshold"
+	#end if
+	#if str($option.option) == "train":
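+		#if ($option.priors):
+			"OPTION::--priors_file $option.priors"
+		#end if
+		#if ($option.initial_parameters):
+			"OPTION::--initial_parameters_file $option.initial_parameters"
+		#end if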
+		"OUTPUT::$output"
+		"OPTION::--convergence_threshold $option.convergence_threshold"
+		"OPTION::--max_iters $option.max_iters"
+	#end if
+	#if ($positions_file):
+		"OPTION::--positions_file $positions_file"
+	#end if
+	"OPTION::--min_base_qual $min_base_quality"
+	"OPTION::--min_map_qual $min_map_quality"
+	"OPTION::--model $model"
+	#if ($chromosome):
+		"OPTION::--chromosome $chromosome"
+	#end if
+</command>
+<inputs>
+	<param name="refFile" type="select" label="Select a reference genome" optional="false">
+		<options from_data_table="all_fasta">
+			<filter type="sort_by" column="2" />
+			<validator type="no_options" message="No indexes are available" />
+		</options>
+	</param>
+	<param name="normal_file" type="data" format="bam" label="Normal Sample" help="BAM file" />
+	<param name="tumor_file" type="data" format="bam" label="Tumor Sample" help="BAM file" />
+	<param name="model" type="select" label="Model" help="" optional="true">
+			<option value="binomial">binomial</option>
+			<option value="snvmix2" selected="true">snvmix2</option>
+			<option value="beta_binomial">beta binomial</option>
+	</param>
+	<param name="positions_file" type="data" format="txt" label="Positions file" help="Restrict the analysis to the positions listed in this file" optional="true"/>
+	<param name="min_map_quality" type="text" label="Min map quality" help="Discard reads with mapping quality below this value" value="0"/>
+	<param name="min_base_quality" type="text" label="Min base quality" help="Discard bases with base quality below this value" value="0"/>
+	<param name="chromosome" type="text" label="Chromosome" help="Chromosome to analyse; leave blank to analyse all chromosomes"/>
+	<conditional name="option">
+		<param name="option" type="select" label="Action" help="" optional="true">
+			<option value="train" selected="true">Train</option>
+			<option value="classify">Classify</option>
+		</param>
+		<when value="train">
+			<param name="priors" type="data" format="txt" label="Prior Probabilities" optional="true"/>
+			<param name="initial_parameters" type="data" format="txt" label="Initial Parameters" optional="true"/>
+			<param name="convergence_threshold" type="text" label="Convergence Threshold"  value="1e-6"/>
+			<param name="max_iters" type="text" label="Max number of training iterations"  value="1000"/>
+		</when>
+		<when value="classify">
+			<param name="parameters" type="data" format="txt" label="Classify Parameters" help="" optional="true" />
+			<param name="somatic_threshold" type="text" label="Somatic Threshold" help="Only report sites with P(Somatic) greater than or equal to this value" value="0.0"/>
+		</when>
+	</conditional>
+</inputs>
+<outputs>
+	<data type="data" format="txt" name="output" label="${tool.name} result on ${on_string}"/>
+</outputs>
+<help>
+.. class:: infomark
+**What it does**
+::
+JointSNVMix implements a probabilistic graphical model to analyse sequence data
+from tumour/normal pairs. The model draws statistical strength by analysing both
+genomes jointly to more accurately classify germline and somatic mutations.
+Train
+The SnvMix family of models are complete generative models of the data.
+As such the model parameters can be learned using the Expectation Maximisation
+(EM) algorithm. The train command allows this to be done.
+All methods require that a file with the parameters for the prior densities,
+and an initial set of parameters be passed in. Templates for these files can
+be found in the config/ directory which ships with the package. If you are
+unsure about setting the priors or parameter values, these files should suffice.
+The train command will produce a parameters file suitable for use with the
+classification command. Training is highly recommended to achieve optimal
+performance when using a SnvMix-based model.
+To reduce memory consumption, all subcommands of train take an optional --skip_size flag.
+This is the number of positions to skip over before sampling a position for the training set.
+Smaller values will lead to larger training sets which will require more memory,
+but should yield better parameter estimates.
+All subcommands of train also take optional parameters for the minimum depth a
+position must have in the tumour and normal samples to be used for training.
+Higher-depth sites should give more robust estimates of the parameters. The
+default values are likely fine.
+Classify
+The classify command is used for analysing tumour/normal paired data and
+computing the posterior probability for each of the nine joint genotypes for
+a pair of diploid genomes.
+**Models**
+::
+There are currently three models supported by both the train and classify commands.
+All models use the JointSNVMix mixture model which jointly analyses the normal and tumour genomes.
+By default snvmix2 is used but other models can be specified.
+binomial
+Uses binomial densities in the mixture model. This was previously referred to as the JointSnvMix1 model.
+snvmix2
+Uses snvmix2 densities in the mixture model, as described in the original SNVMix paper. This was previously referred to as JointSnvMix2.
+beta_binomial
+Uses beta-binomial densities in the mixture model; new in version 0.8. The beta-binomial is a robust (in the statistical sense)
+alternative to the binomial model. It can be beneficial when dealing with over-dispersed data. This is useful in cancer genomes
+since allelic frequencies at somatic mutation sites may deviate significantly from those expected under a diploid model.
+**Input**
+BAM files containing normal and tumor reads.
+**Parameters**
+Classify
+chromosome CHROMOSOME
+Chromosome to analyse. If not set all chromosomes will
+be analysed.
+min_base_qual MIN_BASE_QUAL
+Remove bases with base quality lower than this.
+Default is 0.
+min_map_qual MIN_MAP_QUAL
+Remove reads with mapping quality lower than this.
+Default is 0.
+positions_file POSITIONS_FILE
+Path to a file containing a list of positions to
+use for analysis. Each line should be space separated:
+chrom pos. Additionally, for each chromosome the
+positions should be sorted. This is the same format as
+samtools.
+parameters_file PARAMETERS_FILE
+Path to a file with custom parameters values for the
+model.
+somatic_threshold SOMATIC_THRESHOLD
+Only sites with P(Somatic) = p_AA_AB + p_AA_BB greater
+than or equal to this value will be printed. Default is 0.
+Train
+chromosome CHROMOSOME
+Chromosome to analyse. If not set all chromosomes will
+be analysed.
+min_base_qual MIN_BASE_QUAL
+Remove bases with base quality lower than this.
+Default is 0.
+min_map_qual MIN_MAP_QUAL
+Remove reads with mapping quality lower than this.
+Default is 0.
+positions_file POSITIONS_FILE
+Path to a file containing a list of positions to
+use for analysis. Each line should be space separated:
+chrom pos. Additionally, for each chromosome the
+positions should be sorted. This is the same format as
+samtools.
+priors_file PRIORS_FILE
+Path to a file with priors for the model parameters.
+initial_parameters_file INITIAL_PARAMETERS_FILE
+Path to a file with initial parameter values for the
+model.
+min_normal_depth MIN_NORMAL_DEPTH
+Minimum depth of coverage in normal sample for a site
+to be eligible for use in training set. Default 10
+min_tumour_depth MIN_TUMOUR_DEPTH
+Minimum depth of coverage in tumour sample for a site
+to be eligible for use in training set. Default 10
+max_normal_depth MAX_NORMAL_DEPTH
+Maximum depth of coverage in normal sample for a site
+to be eligible for use in training set. Default 100
+max_tumour_depth MAX_TUMOUR_DEPTH
+Maximum depth of coverage in tumour sample for a site
+to be eligible for use in training set. Default 100
+max_iters MAX_ITERS
+Maximum number of iterations to use for training the
+model. Default 1000
+skip_size SKIP_SIZE
+When subsampling, skip over this number of
+positions before adding a site to the subsample. Larger
+values lead to smaller subsample data sets with faster
+training and less memory. Smaller values should lead
+to better parameter estimates. Default 1.
+convergence_threshold CONVERGENCE_THRESHOLD
+Convergence threshold for EM training. Once the change
+in the objective function falls below this value, training
+will end. Default 1e-6
+</help>
+</tool>
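
The `somatic_threshold` filter described in the help text above can be illustrated with a short sketch. It assumes the nine joint genotype posteriors for a site are available as a dictionary keyed by names such as `p_AA_AB` (read here as normal genotype, then tumour genotype). The function names and the example values are hypothetical; they are not part of JointSNVMix or of this wrapper.

```python
# Hypothetical sketch: not part of JointSNVMix or joint_snv_mix.pl.
# The nine joint genotypes pair a normal genotype (AA, AB, BB) with a tumour
# genotype (AA, AB, BB). The p_<normal>_<tumour> key naming is an assumption
# based on the p_AA_AB / p_AA_BB names used in the help text.
GENOTYPES = ("AA", "AB", "BB")
JOINT_KEYS = tuple(f"p_{n}_{t}" for n in GENOTYPES for t in GENOTYPES)


def somatic_probability(posteriors):
    """Return P(Somatic) = p_AA_AB + p_AA_BB, as defined in the help text."""
    return posteriors["p_AA_AB"] + posteriors["p_AA_BB"]


def passes_somatic_threshold(posteriors, threshold=0.0):
    """Keep a site when P(Somatic) is greater than or equal to the threshold."""
    return somatic_probability(posteriors) >= threshold


# Made-up posteriors for one site; they sum to 1 over the nine joint genotypes.
site = dict.fromkeys(JOINT_KEYS, 0.0)
site.update({"p_AA_AA": 0.25, "p_AA_AB": 0.5, "p_AA_BB": 0.125, "p_AB_AB": 0.125})
```

With the default threshold of 0 every site passes, which is why raising `somatic_threshold` is the way to shrink the classify output to likely somatic sites.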

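The `command` block above hands every setting to `joint_snv_mix.pl` as a quoted `KEY::value` token. How the Perl script handles those tokens is not shown in this changeset, so the following is only one plausible way to split them, written in Python for illustration. The function name `parse_tokens` and the sample paths are hypothetical.

```python
# Hypothetical sketch: the real parsing lives in joint_snv_mix.pl, which is
# not part of this diff. Token keys are taken from the <command> block.
import shlex


def parse_tokens(tokens):
    """Split KEY::value tokens into single-valued settings and OPTION pieces."""
    settings = {}
    options = []
    for token in tokens:
        key, sep, value = token.partition("::")
        if not sep:
            raise ValueError(f"token without '::' separator: {token!r}")
        if key == "OPTION":
            # Each OPTION carries "--flag value"; split it into argv pieces.
            options.extend(shlex.split(value))
        else:
            settings[key] = value
    return settings, options


# Example tokens for a classify run; all paths are made up.
example = [
    "ACTION::classify",
    "REFGENOME::/data/ref.fa",
    "BAMNORMAL::/data/normal.bam",
    "BAMTUMOR::/data/tumor.bam",
    "OPTION::--out_file /data/out.txt",
    "OPTION::--somatic_threshold 0.0",
    "OPTION::--model snvmix2",
]
settings, options = parse_tokens(example)
```

Keeping the repeated `OPTION` tokens in a list preserves their order, while the single-valued keys (`ACTION`, `REFGENOME`, `BAMNORMAL`, `BAMTUMOR`) fit naturally in a dictionary.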