Mercurial > repos > devteam > fastqc
view rgFastQC.xml @ 9:3a458e268066 draft
planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/tools/fastqc commit 8918618a5ef7bdca55a31cd919efa593044a376e
author | devteam |
---|---|
date | Wed, 02 Nov 2016 16:12:51 -0400 |
parents | 06819360a9e2 |
children | a00a6402d09a |
line wrap: on
line source
<tool id="fastqc" name="FastQC" version="0.67"> <description>Read Quality reports</description> <requirements> <requirement type="package" version="0.11.5">fastqc</requirement> </requirements> <stdio> <exit_code range="1:" /> <exit_code range=":-1" /> <regex match="Error:" /> <regex match="Exception:" /> </stdio> <command><![CDATA[ python '$__tool_directory__'/rgFastQC.py -i "$input_file" -d "$html_file.files_path" -o "$html_file" -t "$text_file" -f "$input_file.ext" -j "$input_file.name" #if $contaminants.dataset and str($contaminants) > '' -c "$contaminants" #end if #if $limits.dataset and str($limits) > '' -l "$limits" #end if ]]></command> <inputs> <param format="fastqsanger,fastq,bam,sam" name="input_file" type="data" label="Short read data from your current history" /> <param name="contaminants" type="data" format="tabular" optional="true" label="Contaminant list" help="tab delimited file with 2 columns: name and sequence. For example: Illumina Small RNA RT Primer CAAGCAGAAGACGGCATACGA"/> <param name="limits" type="data" format="txt" optional="true" label="Submodule and Limit specifing file" help="a file that specifies which submodules are to be executed (default=all) and also specifies the thresholds for the each submodules warning parameter" /> </inputs> <outputs> <data format="html" name="html_file" label="${tool.name} on ${on_string}: Webpage" /> <data format="txt" name="text_file" label="${tool.name} on ${on_string}: RawData" /> </outputs> <tests> <test> <param name="input_file" value="1000gsample.fastq" /> <param name="contaminants" value="fastqc_contaminants.txt" ftype="tabular" /> <output name="html_file" file="fastqc_report.html" ftype="html" lines_diff="100"/> <output name="text_file" file="fastqc_data.txt" ftype="txt" lines_diff="100"/> </test> <test> <param name="input_file" value="1000gsample.fastq" /> <param name="limits" value="fastqc_customlimits.txt" ftype="txt" /> <output name="html_file" file="fastqc_report2.html" ftype="html" compare="sim_size" delta="4096"/> <output name="text_file" file="fastqc_data2.txt" ftype="txt" compare="sim_size"/> </test> <test> <param name="input_file" value="1000gsample.fastq.gz" /> <param name="contaminants" value="fastqc_contaminants.txt" ftype="tabular" /> <output name="html_file" file="fastqc_report.html" ftype="html" lines_diff="100"/> <output name="text_file" file="fastqc_data.txt" ftype="txt" lines_diff="100"/> </test> </tests> <help> .. class:: infomark **Purpose** FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis. The main functions of FastQC are: - Import of data from BAM, SAM or FastQ/FastQ.gz files (any variant), - Providing a quick overview to tell you in which areas there may be problems - Summary graphs and tables to quickly assess your data - Export of results to an HTML based permanent report - Offline operation to allow automated generation of reports without running the interactive application ----- .. class:: infomark **FastQC** This is a Galaxy wrapper. It merely exposes the external package FastQC_ which is documented at FastQC_ Kindly acknowledge it as well as this tool if you use it. FastQC incorporates the Picard-tools_ libraries for sam/bam processing. The contaminants file parameter was borrowed from the independently developed fastqcwrapper contributed to the Galaxy Community Tool Shed by J. Johnson. Adaption to version 0.11.2 by T. McGowan. ----- .. class:: infomark **Inputs and outputs** FastQC_ is the best place to look for documentation - it's very good. A summary follows below for those in a tearing hurry. This wrapper will accept a Galaxy fastq, fastq.gz, sam or bam as the input read file to check. It will also take an optional file containing a list of contaminants information, in the form of a tab-delimited file with 2 columns, name and sequence. As another option the tool takes a custom limits.txt file that allows setting the warning thresholds for the different modules and also specifies which modules to include in the output. The tool produces a basic text and a HTML output file that contain all of the results, including the following: - Basic Statistics - Per base sequence quality - Per sequence quality scores - Per base sequence content - Per base GC content - Per sequence GC content - Per base N content - Sequence Length Distribution - Sequence Duplication Levels - Overrepresented sequences - Kmer Content All except Basic Statistics and Overrepresented sequences are plots. .. _FastQC: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/ .. _Picard-tools: http://picard.sourceforge.net/index.shtml </help> <citations> <citation type="bibtex"> @ARTICLE{andrews_s, author = {Andrews, S.}, keywords = {bioinformatics, ngs, qc}, priority = {2}, title = {{FastQC A Quality Control tool for High Throughput Sequence Data}}, url = {http://www.bioinformatics.babraham.ac.uk/projects/fastqc/} } </citation> </citations> </tool>