ConcatPhyl (version 2.0.2)

Files from Filter assemblies:

Aligned files without indels:

nucleic or proteic format according to the analysis you want to do below

Which format do you want to use for this tool (concatenation and RAxML run) ? :

Substitution Model:

Random seed used for the parsimony inferences:

Number of runs:

Specify the number of alternative runs (-N|#) on distinct starting trees In combination with the '-b' option will invoke a multiple boostrap analysis. You can add the bootstopping criteria by choosing the autoMR, autoMRE, autoMRE_IGN, or autoFC value in a menu below instead of providing a number here. Bootstopping will only work in combination with '-x' or '-b'.

Use bootstopping criteria for number of runs:

Algorithm to execute:

Multiple model assignment to alignment partitions:

Specify the file name which contains the assignment of models to alignment partitions for multiple models of substitution. For the syntax of this file please consult the manual.

Rapid bootstrapping random seed:

Specify a random seed and turn on rapid bootstrapping. CAUTION: unlike in version 7.0.4 RAxML will conduct rapid BS replicates under the model of rate heterogeneity you specified via '-m' and not by default under CAT.

What format of file do you want for your output (concatenation of the sequences) ? :

Do you want the output of RAxML : best tree ? :

Do you want the output of RAxML : bi-partition ? :

Do you want the output of RAxML : bootstrap ? :

Only if the option 'rapid bootsptrap' is chosen. When you don't want to choose your options, this output is accessible

Authors Eric Fontanillas created the version 1 of this pipeline. Victor Mataigne developped version 2.

Galaxy integration Julie Baffard and ABiMS TEAM, Roscoff Marine Station

Contact support.abims@sb-roscoff.fr for any questions or concerns about the Galaxy implementation of this tool.

Credits : Gildas le Corguillé, Misharl Monsoor

Description

This tool takes files containing fasta sequences (from the CDS_Search in the AdaptSearch suite) and run RAxML to build a phylogeny.

full RAxML manual here

Parameters

The choice of the format sequences is possible : proteic or nucleic

Several RAxML parameters can be set :
- Substitution model (-m) : Model of Binary (Morphological), Nucleotide, Multi-state, or Amino-Acid substitution
  
  Default : GTRGAMMA (nucleic), PROTCAT (proteic).
- Matrix : AA substitution model (when proteic inputs)
  
  Default : DAYHOFF
- random seed : Specifies a random number seed for the parsimony inferences. For all options/algorithms in RAxML that require some sort of randomization, this option must be specified. Make sure to pass different random number seeds to RAxML and not only 12345.
- Number of runs (-N) : Specifies the number of alternative runs.
  
  By default it's an integer of value 100.
- Use bootstopping criteria for number of runs :
  
  If selected, overxwrites the number of runs to use bootstopping criteria.
- Algorithm to execute (-f) : allows to choose what kind of algorithme RAxML shall execute.
  
  Default : Rapid bootsrap and best ML tree search (-f a).
- Multiple model assignement t oalignment partitions (-q) : an optional parameter. Permits to specify the file name which contains the assignment of models to alignment partitions for multiple models of substitution. For the syntax of this file please consult the manual.
  
  This option allows you to specify the regions of your alignment for which an individual model of nucleotide substitution should be estimated. This will typically be useful to infer trees for long multi-gene alignments.
- Rapid bootstrapping random seed (-x) : Specify an integer number (random seed) and turn on rapid bootstrapping.
  
  In addition to the best tree search. By default, this option is choosen.

Inputs

Files from Filter Assemblies : a set of fasta files (one file per species), e.g. the outputs of the first tool of the AdaptSearch suite.

Used to retrieve all the species names.

Alignment files without indels : a set of fasta files with aligned sequences (with the same species than into the previous parameter), e.g the outputs of the CDS_Search tool of the AdaptSearch suite.

Outputs

This tool, produces the following files :

Phylogeny :

the general output. It gives the information about the concatenation (statistics) and the RAxML run.

Phylogeny_concatenation_fasta_aa :

contains the sequences concatenated in fasta format when you choose the option proteic.

Phylogeny_concatenation_phylip_aa :

contains the sequences concatenated in phylip format when you choose the option proteic.

Phylogeny_concatenation_nexus_aa :

contains the sequences concatenated in nexus format when you choose the option proteic.

Phylogeny_concatenation_fasta_nuc :

contains the sequences concatenated in fasta format when you choose the option nucleic.

Phylogeny_concatenation_phylip_nuc :

contains the sequences concatenated in phylip format when you choose the option nucleic. it's this output which is used for the RAxML run.

Phylogeny_concatenation_nexus_nuc :

contains the sequences concatenated in nexus format when you choose the option nucleic.

Phylogeny_RAxML_BestTree** :

the output of RAxML run which contains the Best Tree found.

Phylogeny_RAxML_BiPartitionBranchLabel :

the output of RAxML run which contains the Best Tree found with supported values as branch labels.

Phylogeny_RAxML_BiPartition :

the output of RAxML run which contains the Best Tree found with supported values.

Phylogeny_RAxML_BootStrap :

the output of RAxML run which contains all the boostrapped trees. The number of boostraped trees depending of the option -N (number of run).

The AdaptSearch Pipeline

/repository/static/images/4a43eb903452582e/adaptsearch_picture_helps.png

Changelog

Version 2.0 - 06/07/2017

NEW: Replace the zip between tools by Dataset Collection

Version 1.0 - 13/04/2017

Add funtional test with planemo

Planemo test with conda dependencies for raxml and python

Scripts renamed + symlinks to the directory 'scripts'