Mercurial > repos > iuc > spotyping
changeset 0:545d934aed81 draft
planemo upload for repository https://github.com/xiaeryu/SpoTyping-v2.0/tree/master/SpoTyping-v2.0-commandLine commit 71c2659a468b7d83f0d438ca6dc888bd8d66d3f5
author | iuc |
---|---|
date | Tue, 08 May 2018 10:22:03 -0400 |
parents | |
children | f82981245fbe |
files | spotyping.xml test-data/input.fastq.gz |
diffstat | 2 files changed, 104 insertions(+), 0 deletions(-) [+] |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/spotyping.xml Tue May 08 10:22:03 2018 -0400 @@ -0,0 +1,104 @@ +<tool id="spotyping" name="SpoTyping" version="@TOOL_VERSION@+galaxy0" profile="17.01"> + <description>fast and accurate in silico Mycobacterium spoligotyping from sequence reads</description> + + <macros> + <token name="@TOOL_VERSION@">2.1</token> + </macros> + + <requirements> + <requirement type="package" version="@TOOL_VERSION@">spotyping</requirement> + </requirements> + + <command detect_errors="exit_code"><![CDATA[ + #set $input_file='input.' + $input.extension + ln -s '${input}' $input_file && + SpoTyping.py + --noQuery + $advanced.seq + $advanced.swift + $advanced.filter + $advanced.sorted + $input_file && + cat SpoTyping.log SpoTyping > '${output_txt}' + ]]> + </command> + + <inputs> + <param name="input" type="data" format="fastq,fastq.gz,fasta" label="Sequence reads" /> + <section name="advanced" title="Advanced options" expanded="false"> + <param type="boolean" argument="--seq" label="Input is assembled sequence" help="Input is either a complete genomic sequence or assembled contigs from an isolate" truevalue="--seq" falsevalue="" checked="false" /> + <param type="boolean" argument="--swift" label="Swift mode" checked="true" truevalue="--swift=on" falsevalue="--swift=off" /> + <param type="boolean" argument="--filter" label="Stringent filtering of reads" truevalue="--filter" falsevalue="" checked="false" /> + <param type="boolean" argument="--sorted" label="Reads are sorted to a reference genome" truevalue="--sorted" falsevalue="" /> + </section> + </inputs> + + <outputs> + <data name="output_txt" label="SpoTyping spoligotyping on ${on_string}" format="txt" /> + </outputs> + + <tests> + <test> + <param name="input" value="input.fastq.gz" ftype="fastq.gz" /> + <output name="output_txt"> + <assert_contents> + <has_text text="1111111111111111101111111111111100001111111" /> + </assert_contents> + </output> + </test> + </tests> + + <help><![CDATA[ + SpoTyping_ is a software for predicting spoligotype_ from sequencing reads, complete genomic sequences and assembled contigs. + + **Input:** + + - Fastq file - if paired end data is used, you may choose to concatenate paired reads into a single input (e.g. using the cat tool) + - Fasta file of a complete genomic sequence or assembled contigs of an isolate (with --seq option) + + *Note on input size*: In swift mode the sampling threshold is reached in approximately 30x coverage when using + paired end sequencing of a *M. tuberculosis* genome. + + **Output:** + + Count of hits from BLAST result for each spacer sequence and predicted spoligotype in the format of binary code and octal code. + + **Options:** + + \--seq + Set this if input is a fasta file that contains only complete genomic sequence or assembled contigs from an isolate. [Default is off] + + \-s SWIFT, --swift=SWIFT + Swift mode, either "on" or "off" [Default: on] - swift mode samples 250 million bases to use for spoligotyping + + \--sorted + Set if input reads are sorted relative to positions on a reference genome. If reads are sorted and swift mode is used, swift mode's sampling is adjusted + to sample reads across positions in the genome evenly. + + \--filter + Filter reads such that: + + 1. Leading and trailing 'N's would be removed. + 2. Any read with more than 3 'N's in the middle would be removed. + 3. Any read with more than 7 consecutive bases identical would be trimmed/filtered out given + the length of the flanking regions. + + **Got weird spoligotype prediction?** + + Sequencing throughput is very low (<40Mbp, for example): SpoTyping may not be able to give accurate prediction due to the relatively low read depth. + + **Interpreting the spoligotype** + + The binary or octal spoligotype can be used to look up lineage information using a service + like `TB Lineage`_. + + .. _SpoTyping: https://github.com/xiaeryu/SpoTyping-v2.0/tree/master/SpoTyping-v2.0-commandLine + .. _spoligotype: https://www.ncbi.nlm.nih.gov/pubmed/19521871 + .. _TB Lineage: http://tbinsight.cs.rpi.edu/run_tb_lineage.html + ]]> + </help> + + <citations> + <citation type="doi">10.1186/s13073-016-0270-7</citation> + </citations> +</tool>