annotate README.rst @ 12:ac0235d2d459 draft
planemo upload for repository https://github.com/blankenberg/tools-blankenberg/tree/master/tools/naive_variant_caller commit ce964ed3ab7e390754fa03bb32a593fbe79dcf04
| author | blankenberg |
|---|---|
| date | Thu, 17 Sep 2015 14:56:15 -0400 |
| parents | 907b40517289 |
| children | 5c852eca82e0 |
| rev | line source |
|---|---|
| 0 | 1 This repository contains the **Naive Variant Caller** tool. |
| 2 | |
| 3 ------ | |
| 4 | |
| 5 **What it does** | |
| 6 | |
| 7 This tool is a naive variant caller that processes aligned sequencing reads from the BAM format and produces a VCF file containing per position variant calls. This tool allows multiple BAM files to be provided as input and utilizes read group information to make calls for individual samples. | |
| 8 | |
| 9 User-configurable options allow filtering out reads that do not pass mapping or base quality thresholds and setting a minimum per-base read depth; users can also specify the ploidy and whether to consider each strand separately. | |
| 10 | |
| 11 In addition to calling alternate alleles based upon simple ratios of nucleotides at a position, per base nucleotide counts are also provided. A custom tag, NC, is used within the Genotype fields. The NC field is a comma-separated listing of nucleotide counts in the form of <nucleotide>=<count>, where a plus or minus character is prepended to indicate strand, if the strandedness option was specified. | |
| 12 | |
| 13 | |
| 14 ------ | |
| 15 | |
| 16 **Inputs** | |
| 17 | |
| 18 Accepts one or more BAM input files and a reference genome from the built-in list or from a FASTA file in your history. | |
| 19 | |
| 20 | |
| 21 **Outputs** | |
| 22 | |
| 23 The output is in VCF format. | |
| 24 | |
| 25 Example VCF output line, without reporting by strand: | |
| 26 ``chrM 16029 . T G,A,C . . AC=15,9,5;AF=0.00155311658729,0.000931869952371,0.000517705529095 GT:AC:AF:NC 0/0:15,9,5:0.00155311658729,0.000931869952371,0.000517705529095:A=9,C=5,T=9629,G=15,`` | |
| 27 | |
| 28 Example VCF output line, when reporting by strand: | |
| 29 ``chrM 16029 . T G,A,C . . AC=15,9,5;AF=0.00155311658729,0.000931869952371,0.000517705529095 GT:AC:AF:NC 0/0:15,9,5:0.00155311658729,0.000931869952371,0.000517705529095:+T=3972,-A=9,-C=5,-T=5657,-G=15,`` | |
| 30 | |
| 31 **Options** | |
| 32 | |
| 33 Reference Genome: | |
| 34 | |
| 35 Ensure that you have selected the correct reference genome, either from the list of built-in genomes or by selecting the corresponding FASTA file from your history. | |
| 36 | |
| 37 Restrict to regions: | |
| 38 | |
| 39 You can specify any number of regions for which you would like to receive results. Each region can be given as just a chromosome name, a chromosome name and start position, or a chromosome name with start and end positions. | |
| 40 | |
| 41 Minimum number of reads needed to consider a REF/ALT: | |
| 42 | |
| 43 This value sets the minimum number of reads that must contain a particular base at a position for that allele to be listed and used in genotyping calls. Default is 0. | |
| 44 | |
| 45 Minimum base quality: | |
| 46 | |
| 47 The minimum base quality score needed for the position in a read to be used for nucleotide counts and genotyping. Default is no filter. | |
| 48 | |
| 49 Minimum mapping quality: | |
| 50 | |
| 51 The minimum mapping quality score needed to consider a read for nucleotide counts and genotyping. Default is no filter. | |
| 52 | |
| 53 Ploidy: | |
| 54 | |
| 55 The number of genotype calls to make at each reported position. | |
| 56 | |
| 10 | 57 Only write out positions with possible alternate alleles: |
| 0 | 58 |
| 59 When set, only positions that have at least one non-reference nucleotide passing the declared filters will be present in the output. | |
| 60 | |
| 61 Report counts by strand: | |
| 62 | |
| 63 When set, nucleotide counts (NC) will be reported in reference to the aligned read's source strand. Reported as: <strand><BASE>=<COUNT>. | |
| 64 | |
| 65 Choose the dtype to use for storing coverage information: | |
| 66 | |
| 67 This controls the maximum depth value that can be stored for each nucleotide/position/strand (when specified). Smaller dtypes require less memory but have lower maximum values. | |
| 68 | |
| 69 +--------+----------------------------+ | |
| 70 | name | maximum coverage value | | |
| 71 +========+============================+ | |
| 72 | uint8 | 255 | | |
| 73 +--------+----------------------------+ | |
| 74 | uint16 | 65,535 | | |
| 75 +--------+----------------------------+ | |
| 76 | uint32 | 4,294,967,295 | | |
| 77 +--------+----------------------------+ | |
| 78 | uint64 | 18,446,744,073,709,551,615 | | |
| 79 +--------+----------------------------+ | |
| 80 | |
| 81 | |
| 82 ------ | |
| 83 | |
| 84 **Citation** | |
| 85 | |
| 86 If you use this tool, please cite Blankenberg D, et al. *In preparation.* |
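
The NC field described under **What it does** is simple enough to parse by hand. The following is a minimal sketch in Python, not code from the tool itself: the function names ``parse_nc`` and ``collapse_strands`` are hypothetical, and the sketch assumes the NC value has exactly the shape shown in the two example VCF lines (comma-separated ``<nucleotide>=<count>`` items, an optional leading ``+`` or ``-`` for strand, and a trailing comma).

```python
def parse_nc(nc_field):
    """Parse an NC value into a dict keyed by (strand, nucleotide).

    strand is '+', '-', or None when counts were not reported by strand.
    Hypothetical helper; assumes the format shown in the README examples.
    """
    counts = {}
    for item in nc_field.split(","):
        if not item:
            # The example lines end with a trailing comma; skip the empty item.
            continue
        key, _, value = item.partition("=")
        strand = None
        if key[:1] in ("+", "-"):
            strand, key = key[0], key[1:]
        counts[(strand, key)] = int(value)
    return counts


def collapse_strands(counts):
    """Sum per-strand counts into one total per nucleotide."""
    totals = {}
    for (_strand, base), n in counts.items():
        totals[base] = totals.get(base, 0) + n
    return totals


# NC values copied from the two example VCF lines above.
unstranded = parse_nc("A=9,C=5,T=9629,G=15,")
stranded = parse_nc("+T=3972,-A=9,-C=5,-T=5657,-G=15,")

# Collapsing the stranded counts gives the same per-nucleotide totals.
totals = collapse_strands(stranded)
depth = sum(totals.values())
g_fraction = totals["G"] / depth
```

For the example lines, dividing the G count by the summed counts reproduces the reported AF value for G to the digits shown, which is consistent with the "simple ratios of nucleotides" description; treat that as an observation about this one example rather than a statement of how the tool computes AF in general.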
