view mothur/tools/mothur/align.seqs.xml @ 32:ec8df51e841a

Fixes courtesy of Peter Briggs:

metagenomics.py:
- Groups class: fix for when the second column is not present
- Axes class: make the 'sniff' method more sensitive, to try to restrict arbitrary tabular data uploads from being sniffed as this type

mothur_wrapper.py:
- update cmd_dict['chimera.perseus'] to use the correct inputs
- add --beta option (needed for the chimera.perseus tool)
- add function for converting input floats from scientific notation (e.g. 1e-6, which mothur can't handle) to decimal format (e.g. 0.000001, which it can)

chimera.perseus.xml:
- add output data item for the "accnos" file (not previously captured in the history)

trim.flows.xml:
- cosmetic change: base name for additional output data items is now "trim.flows" rather than "logfile" (clarifies history items)

shhh.flows.xml:
- cosmetic change: update the default value specified in the help for mindiff to be consistent with that given in the mothur documentation
author Jim Johnson <jj@umn.edu>
date Wed, 04 Sep 2013 10:51:34 -0500
parents 49058b1f8d3f
children 95d75b35e4d2

<tool id="mothur_align_seqs" name="Align.seqs" version="1.19.0">
 <description>Align sequences to a template alignment</description>
 <command interpreter="python">
  mothur_wrapper.py 
  --cmd='align.seqs'
  --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.align$:'$out_file,'^\S+\.align\.report$:'$report
  --outputdir='$logfile.extra_files_path'
  --fasta=$candidate
  --reference=$alignment.template
  #if $search.method == 'kmer': 
   --ksize=$search.ksize
  #else: 
   --search=$search.method
  #end if
  --align=$align
  #if $scoring.adjust == 'yes': 
   --match=$scoring.match
   --mismatch=$scoring.mismatch
   --gapopen=$scoring.gapopen
   --gapextend=$scoring.gapextend
  #end if
  #if $reverse.flip == 'yes': 
   --flip=true
   --threshold=$reverse.threshold
  #end if
  --processors=8
 </command>
 <inputs>
  <param name="candidate" type="data" format="fasta" label="fasta - Candidate Sequences"/>
  <conditional name="alignment">
   <param name="source" type="select" label="Select Reference Template from" help="">
    <option value="ref">Cached Reference</option>
    <option value="history">Your History</option>
   </param>
   <when value="ref">
    <param name="template" type="select" label="reference - Select an alignment database " help="">
     <options from_file="mothur_aligndb.loc">
      <column name="name" index="0" />
      <column name="value" index="1" />
     </options>
    </param>
   </when>
   <when value="history">
    <param name="template" type="data" format="fasta" label="reference - Reference to align with" help=""/>
   </when>
  </conditional>
  <conditional name="search">
   <param name="method" type="select" label="Select a search method" help="">
    <option value="kmer">kmer (default)</option>
    <option value="suffix">suffix tree</option>
    <option value="blast">blast</option>
   </param>
   <when value="kmer">
    <param name="ksize" type="integer" value="8" label="ksize - kmer length between 5 and 12">
      <validator type="in_range" message="ksize - kmer length between 5 and 12" min="5" max="12"/>
    </param>
   </when>
   <when value="suffix"/>
   <when value="blast"/>
  </conditional>
  <param name="align" type="select" label="align - Select a pairwise alignment method" help="">
   <option value="needleman">needleman (default)</option>
   <option value="gotoh">gotoh</option>
   <option value="blast">blast</option>
  </param>

  <conditional name="scoring">
   <param name="adjust" type="select" label="Alignment scoring values" help="">
    <option value="no">use defaults</option>
    <option value="yes">adjust values</option>
   </param>
   <when value="no"/>
   <when value="yes">
    <param name="match" type="integer" value="1" label="match - Pairwise alignment reward for a match"/>
    <param name="mismatch" type="integer" value="-1" label="mismatch - Pairwise alignment penalty for a mismatch"/>
    <param name="gapopen" type="integer" value="-2" label="gapopen - Pairwise alignment penalty for opening a gap"/>
    <param name="gapextend" type="integer" value="-1" label="gapextend - Pairwise alignment penalty for extending a gap"/>
   </when>
  </conditional>

  <conditional name="reverse">
   <param name="flip" type="select" label="flip - Try to align against the reverse complement" help="">
    <option value="no">No</option>
    <option value="yes">Yes</option>
   </param>
   <when value="no"/>
   <when value="yes">
    <param name="threshold" type="float" value=".5" label="threshold - Cutoff (0. - 1.) at which an alignment is deemed 'bad' and the reverse complement may be tried."/>
   </when>
  </conditional>

 </inputs>
 <outputs>
  <data format="html" name="logfile" label="${tool.name} on ${on_string}: logfile" />
  <data format="align" name="out_file" label="${tool.name} on ${on_string}: align" />
  <data format="align.report" name="report" label="${tool.name} on ${on_string}: align.report" />
 </outputs>
 <requirements>
  <requirement type="package" version="1.27">mothur</requirement>
 </requirements>
 <tests>
 </tests>
 <help>
**Mothur Overview**

Mothur_, initiated by Dr. Patrick Schloss and his software development team
in the Department of Microbiology and Immunology at The University of Michigan,
provides bioinformatics for the microbial ecology community.

.. _Mothur: http://www.mothur.org/wiki/Main_Page

**Command Documentation**

The align.seqs_ command aligns a user-supplied fasta-formatted candidate sequence file to a user-supplied fasta-formatted template_alignment_.

The general approach is to
 i) find the closest template for each candidate using kmer searching, blastn, or suffix tree searching;
 ii) make a pairwise alignment between the candidate and de-gapped template sequences using the Needleman-Wunsch, Gotoh, or blastn algorithms; and
 iii) re-insert gaps into the candidate and template pairwise alignments using the NAST algorithm so that the candidate sequence alignment is compatible with the original template alignment.

In general, the alignment is very fast: we are able to align over 186,000 full-length sequences to the SILVA alignment in less than 3 hours, with quality as good as that of the SINA aligner. Furthermore, this rate can be accelerated using multiple processors. While the aligner doesn't explicitly take into account the secondary structure of the 16S rRNA gene, if the template database is based on the secondary structure, then the resulting alignment will at least be implicitly based on the secondary structure.
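
As a rough illustration only, a run that keeps the default kmer search and Needleman-Wunsch alignment and also tries the reverse complement corresponds to a mothur command along the following lines. The file names are placeholders assumed for this example, not files shipped with the tool, and the exact command this tool builds depends on the options selected in the form::

    align.seqs(fasta=candidates.fasta, reference=template.align.fasta, ksize=8, align=needleman, flip=t, threshold=0.5, processors=8)

With the form defaults, such a run would be expected to yield an alignment file and an alignment report alongside the mothur logfile, matching the three outputs of this tool.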

.. _template_alignment: http://www.mothur.org/wiki/Alignment_database
.. _align.seqs: http://www.mothur.org/wiki/Align.seqs


 </help>
</tool>