view coverage_distributions.xml @ 39:e56023008e36 default tip
Changed revision of package_fisher_0_1_4 to be2fc454d121
Changed revision of package_matplotlib_1_2 to a03ee94316b5
author | miller-lab |
---|---|
date | Mon, 06 Jul 2015 10:32:24 -0400 |
parents | a631c2f6d913 |
children |
<tool id="gd_coverage_distributions" name="Coverage Distributions" version="1.0.0">
  <description>: Examine sequence coverage for SNPs</description>

  <command interpreter="python">
    #import json
    #import base64
    #import zlib
    #set $ind_names = $input.dataset.metadata.individual_names
    #set $ind_colms = $input.dataset.metadata.individual_columns
    #set $ind_dict = dict(zip($ind_names, $ind_colms))
    #set $ind_json = json.dumps($ind_dict, separators=(',',':'))
    #set $ind_comp = zlib.compress($ind_json, 9)
    #set $ind_arg = base64.b64encode($ind_comp)
    coverage_distributions.py '$input' '0' '$output' '$output.files_path' '$ind_arg'
    #if $individuals.choice == '0'
      'all_individuals'
    #else if $individuals.choice == '1'
      #set $arg = 'individuals:%s' % str($individuals.p1_input)
      '$arg'
    #else if $individuals.choice == '2'
      #for $population in $individuals.populations
        #set $arg = 'population:%s:%s' % (str($population.p_input), str($population.p_input.name))
        '$arg'
      #end for
    #end if
  </command>

  <inputs>
    <param name="input" type="data" format="gd_snp" label="SNP dataset" />
    <conditional name="individuals">
      <param name="choice" type="select" label="Compute for">
        <option value="0" selected="true">All individuals</option>
        <option value="1">Individuals in a population</option>
        <option value="2">Totals of populations</option>
      </param>
      <when value="0" />
      <when value="1">
        <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" />
      </when>
      <when value="2">
        <repeat name="populations" title="Population" min="1">
          <param name="p_input" type="data" format="gd_indivs" label="individuals" />
        </repeat>
      </when>
    </conditional>

    <!--
    <param name="data_source" type="select" label="Data source">
      <option value="0" selected="true">Sequence coverage</option>
      <option value="1">Genotype quality</option>
    </param>
    -->
  </inputs>

  <outputs>
    <data name="output" format="html" />
  </outputs>

  <requirements>
    <requirement type="package" version="0.1">gd_c_tools</requirement>
  </requirements>

  <tests>
    <test>
      <param name="input" value="test_in/sample.gd_snp" ftype="gd_snp" />
      <param name="choice" value="0" />
      <output name="output" file="test_out/coverage_distributions/coverage.html" ftype="html" compare="diff" lines_diff="2">
        <extra_files type="file" name="coverage.pdf" value="test_out/coverage_distributions/coverage.pdf" compare="sim_size" delta="1000" />
        <extra_files type="file" name="coverage.txt" value="test_out/coverage_distributions/coverage.txt" />
      </output>
    </test>
  </tests>

  <help>

**Dataset formats**

The input dataset is in gd_snp_ format.
The output is a composite dataset, containing both a text table and a PDF plot.
(`Dataset missing?`_)

.. _gd_snp: ./static/formatHelp.html#gd_snp
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool reports distributions of a SNP reliability indicator, in this case
sequence coverage, for individuals or populations.
The coverage can be computed for all individuals, a subset of individuals, or
totals for populations defined by the Specify Individuals tool.
The results are reported as a text table giving the cumulative distributions,
and as a plot.

-----

**Example**

- input::

    chr1 14929 A G 999 21 30 1 127 7 11 1 28 7 29 0 5 2 5 1 17 10 14 1 81 17 74 1 42 15 22 1 125 29 84 1 88 6 10 1 11 30 23 1 79 19 1 2 71 24 0 2 99 41 10 2 2
    chr1 17451 C T 6.88 119 1 2 255 12 0 2 63 35 0 2 59 14 0 2 72 19 1 2 57 101 1 2 255 38 8 1 20 125 0 2 255 13 0 2 62 42 0 2 51 44 0 2 64 26 0 2 108 59 0 2 194
    chr1 30922 G T 999 0 23 0 66 0 0 -1 0 0 0 -1 0 0 0 -1 0 0 2 0 3 0 14 0 39 14 16 1 153 0 45 0 132 6 0 2 48 19 0 2 87 3 0 2 32 0 0 -1 0 0 0 -1 0
    etc.

- text output::

                0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
    John West   0  0  0  0  0  0  0  0  1  1  1  1  2  2  3  3  4  4  5  6
    NA12892     0  2  5 11 20 31 43 55 67 77 84 90 93 96 97 98 99 99 99 99
    NA12891     0  0  0  0  0  1  1  2  3  5  6  9 11 15 19 23 29 35 41 47
    NA12249     1  4 11 23 38 54 68 79 88 93 96 98 99 99 99 99 99 99 99 99
    NA12342     0  0  1  1  2  4  6  9 13 18 23 29 36 43 50 58 65 71 77 82
    KB1         0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  2  2
    ABT         0  0  0  0  0  0  1  1  1  2  3  4  5  6  8 10 12 14 18 21
    NA18507     0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1
    NA19238     0  0  0  1  2  4  6 10 14 19 25 32 39 47 55 62 69 76 81 86
    NA19239     0  0  0  0  1  1  2  4  5  8 11 15 19 24 31 37 44 51 58 65
    YH          2  4  6  7  8  8  9 10 11 12 14 17 19 22 25 29 32 36 40 45
    KOREAN      0  0  1  1  3  4  5  7 10 12 15 19 22 27 31 37 42 48 54 60
    JPT         0  0  0  0  0  0  0  0  1  1  1  2  2  3  4  5  7  8 10 12
    etc.

graphical output:

.. image:: $PATH_TO_IMAGES/gd_coverage.png

  </help>
</tool>
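
The `<command>` template passes the mapping from individual names to column numbers as a single argument, `$ind_arg`, built by serializing the mapping to compact JSON, compressing it with zlib at level 9, and base64-encoding the result. The Python 3 sketch below illustrates that round trip. It is an illustration under assumptions: the function names are hypothetical, the column numbers in the example are invented, and how `coverage_distributions.py` actually decodes the argument is not shown in this file.

```python
import base64
import json
import zlib


def encode_individuals(ind_dict):
    """Mirror the template steps: compact JSON -> zlib (level 9) -> base64.

    Hypothetical helper; the template itself does this inline in Cheetah.
    """
    ind_json = json.dumps(ind_dict, separators=(',', ':'))
    # Python 3 needs bytes for zlib; the original template targets Python 2.
    ind_comp = zlib.compress(ind_json.encode('utf-8'), 9)
    return base64.b64encode(ind_comp).decode('ascii')


def decode_individuals(ind_arg):
    """Reverse the steps: base64 -> zlib -> JSON -> dict.

    Hypothetical helper; the real script's decoding is an assumption here.
    """
    ind_comp = base64.b64decode(ind_arg)
    ind_json = zlib.decompress(ind_comp).decode('utf-8')
    return json.loads(ind_json)


if __name__ == '__main__':
    # Names come from the help text; the column numbers are made up.
    individuals = {'KB1': 26, 'ABT': 30}
    arg = encode_individuals(individuals)
    # The encoded value is plain ASCII, so it survives single-quoting on the
    # command line the way '$ind_arg' is quoted in the template.
    assert decode_individuals(arg) == individuals
    print(arg)
```

Packing the mapping into one base64 string keeps names that contain spaces or quotes (such as `John West`) from disturbing the argument list.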