view snpSift_caseControl.xml @ 2:a9bae7957c36

Fix typo on caseControl in commandline
author Jim Johnson <jj@umn.edu>
date Wed, 30 Jan 2013 17:29:10 -0600
parents 2c595fea585c
children 192a236898f5
line wrap: on
line source

<tool id="snpSift_caseControl" name="SnpSift CaseControl" version="3.1">
	<description>Count samples are in 'case' and 'control' groups.</description>
	<!-- 
	    You will need to change the path to wherever your installation is.
		You can change the amount of memory used, just change the -Xmx parameter (e.g. use -Xmx2G for 2Gb of memory)
	-->
	<requirements>
                <requirement type="package" version="3.1">snpEff</requirement>
	</requirements>
	<command>
		java -Xmx1G -jar \$JAVA_JAR_PATH/SnpSift.jar caseControl -q $hhCase $hhControl '$caseControStr' $input > $output
	</command>
	<inputs>
		<param format="vcf" name="input" type="data" label="VCF input"/>
		<param name="hhCase" type="select" label="Hom/Het case">
			<option value="any">Any</option>
			<option value="hom">Homozygous</option>
			<option value="het">Heterozygous</option>
		</param>
		<param name="hhControl" type="select" label="Hom/Het control">
			<option value="any">Any</option>
			<option value="hom">Homozygous</option>
			<option value="het">Heterozygous</option>
		</param>
		<param name="caseControStr" type="text" label="Case / Control column designation" size="50">
		<help>
Case and control are defined by a string containing plus and minus symbols {'+', '-', '0'} where '+' is case, '-' is control and '0' is neutral
		</help>
		<validator type="regex" message="must be  only plus(+), minus(-), or zero(0) characters">[+-0]+</validator>
                </param>
	</inputs>
	<outputs>
		<data format="vcf" name="output" />
	</outputs>
        <stdio>
          <exit_code range=":-1"  level="fatal"   description="Error: Cannot open file" />
          <exit_code range="1:"  level="fatal"   description="Error" />
        </stdio>

	<help>

**SnpSift CaseControl**

Allows you to count how many samples are in 'case' group and a 'control' group. You can count 'homozygous', 'heterozygous' or 'any' variants. 

Case and control are defined by a string containing plus and minus symbols {'+', '-', '0'} where '+' is case, '-' is control and '0' is neutral. 

This command adds two annotations to the VCF file:

 - **CaseControl**: Two comma separated numbers numbers representing the number of samples that have the variant in the case and the control group. Example: 

  "CaseControl=3,4" *the variant is present in 3 cases and 4 controls.*


 - **CaseControlP**: A p-value (Fisher exact test) that the number of cases is N or more. Example:

  "CaseControl=4,0;CaseControlP=3.030303e-02" *in this case the pValue of having 4 or more cases and zero controls is 0.03*


For example, if we have ten samples (which means ten genotype columns in the VCF file), the first four are 'case' and the last six are 'control', so the description string would be "++++------".  Let's say we want to distinguish genotypes that are homozygous in 'case' and either homozygous or heterozygous in 'control'.  We would set:

  - Hom/Het case = "hom"

  - Hom/Het control = "any"  

  - Case / Control column designation = ""++++------"


For details about this tool, please go to http://snpeff.sourceforge.net/SnpSift.html#casecontrol

	</help>
</tool>