view FastQ_QualConverter.xml @ 1:3990d6b37e2d draft default tip

Uploaded
author geert-vandeweyer
date Thu, 13 Feb 2014 08:24:43 -0500

<tool id="fastq_qual_convert" name="FASTQ QualityConverter" version="1.0.4">
  <description>convert FASTQ quality scores from Solexa, Illumina 1.3-1.7, or Sanger encoding to fastqsanger</description>
  <command interpreter="perl">FastQ_QualConverter.pl -i '$input_file' -f '$input_type' -o '$output_file'</command>
  <inputs>
    <param name="input_file" type="data" format="fastq" label="File to Convert" />
    <param name="input_type" type="select" label="Input FASTQ quality scores type">
      <option value="Auto" selected="True">Auto</option>
      <option value="solexa">Solexa</option>
      <option value="illumina">Illumina 1.3-1.7</option>
      <option value="sanger">Sanger (does nothing)</option>
    </param>
  </inputs>
  <outputs>
    <data name="output_file" format="fastqsanger">
    </data>
  </outputs>
  <tests>
    <!-- These tests include test files adapted from supplemental material in Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 2009 Dec 16. -->
    <!-- Unfortunately, cannot test for expected failures -->
    <!-- Test basic options -->
    <test>
      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
      <param name="input_type" value="sanger" />
      <output name="output_file" file="sanger_full_range_original_sanger.fastqsanger" />
    </test>
    <test>
      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastq" />
      <param name="input_type" value="illumina" />
      <output name="output_file" file="illumina_full_range_as_sanger.fastqsanger" />
    </test>
    <test>
      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
      <param name="input_type" value="solexa" />
      <output name="output_file" file="solexa_full_range_as_sanger.fastqsanger" />
    </test>
  </tests>
  <help>
**What it does**

This tool offers several conversion options for FASTQ quality scores. Output is always fastqsanger. The input encoding can be specified or auto-detected (based on the first 15000 reads).
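
As a rough illustration of the auto-detection described above, the Python sketch below guesses the encoding from the lowest quality character found in the first 15000 reads. It is not taken from FastQ_QualConverter.pl, whose detection rule is not documented here. The function name guess_encoding, the thresholds 59 and 64 (the lowest Solexa and Illumina codes), and the assumption of unwrapped four-line FASTQ records are all hypothetical choices made for this example.

```python
import bisect
import itertools

# Assumed thresholds: the lowest character code of each encoding
# (Sanger starts at 33, Solexa at 59, Illumina 1.3-1.7 at 64).
BOUNDARIES = [59, 64]
NAMES = ["sanger", "solexa", "illumina"]


def guess_encoding(lines, max_reads=15000):
    """Guess the quality encoding from the first max_reads FASTQ records.

    Hypothetical helper: assumes each record spans exactly four lines.
    """
    # In a four-line record the quality string is the fourth line
    # (indices 3, 7, 11, ...); stop after max_reads records.
    quality_lines = itertools.islice(lines, 3, 4 * max_reads, 4)
    codes = [ord(c) for line in quality_lines for c in line.rstrip("\r\n")]
    if not codes:
        raise ValueError("no quality values found")
    # bisect_right counts how many thresholds are at or below the lowest code,
    # which is also the index of the matching name.
    return NAMES[bisect.bisect_right(BOUNDARIES, min(codes))]
```

A range-based guess like this one can be wrong for uniformly high-quality data: a Sanger file whose sampled reads contain no character below code 64 would be reported as illumina, which is one reason to set the input type explicitly when it is known.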

It is intended as a faster alternative to the default FASTQ Groomer.


-----

**Quality Score Comparison**

::

    SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
    ...............................IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
    ..........................XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    !"#$%&amp;'()*+,-./0123456789:;&lt;=&gt;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
    |                         |    |        |                              |                     |
   33                        59   64       73                            104                   126
  
   S - Sanger       Phred+33,  94 values  (0, 93) (0 to 60 expected in raw reads) (sanger = input)
   I - Illumina 1.3 Phred+64,  63 values  (0, 62) (0 to 40 expected in raw reads) (sanger = input - 31)
   X - Solexa       Solexa+64, 68 values (-5, 62) (-5 to 40 expected in raw reads) (sanger = 33 + 10 * log(1 + 10 ** ((input - 64) / 10.0)) / log(10))

Diagram adapted from http://en.wikipedia.org/wiki/FASTQ_format
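
The per-character conversions listed in the table can be sketched in Python as follows. This is an illustrative sketch, not the code in FastQ_QualConverter.pl. The function names are hypothetical, and rounding the Solexa result to the nearest integer is an assumption, since the table does not say how fractional scores are handled.

```python
import math

SANGER_OFFSET = 33   # Phred+33
SOLEXA_OFFSET = 64   # Solexa+64 (Illumina 1.3-1.7 uses the same offset with Phred scores)


def solexa_to_phred(q_solexa):
    """Map a Solexa score to a Phred score: 10 * log10(1 + 10 ** (Q / 10))."""
    return 10 * math.log10(1 + 10 ** (q_solexa / 10.0))


def to_sanger_char(char, input_type):
    """Re-encode one quality character as Sanger (Phred+33)."""
    code = ord(char)
    if input_type == "sanger":
        return char                      # sanger = input
    if input_type == "illumina":
        return chr(code - 31)            # sanger = input - 31
    if input_type == "solexa":
        phred = solexa_to_phred(code - SOLEXA_OFFSET)
        # Assumption: round to the nearest integer Phred score.
        return chr(SANGER_OFFSET + int(round(phred)))
    raise ValueError("unknown input type: " + input_type)


def to_sanger(quality, input_type):
    """Re-encode a whole quality string as Sanger (Phred+33)."""
    return "".join(to_sanger_char(c, input_type) for c in quality)
```

With these definitions, to_sanger("h", "illumina") returns "I" (code 104 - 31 = 73), and the lowest Solexa character ";" (score -5) maps to Phred 1.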

.. class:: infomark

Output from Illumina 1.8+ pipelines is Sanger-encoded.


  </help>
</tool>