
FROGS Affiliation postprocess (version 3.1)

Abundance file of affiliated OTUs:

Abundances of affiliated OTUs (format: BIOM).

OTU seed sequences:

OTU sequences (format: FASTA).

Is this amplicon hypervariable in length?:

A multi-affiliation tag may be resolved by selecting the shortest amplicon reference. This requires the reference FASTA file for your type of amplicon.

Minimum identity for aggregation:

OTUs will be aggregated if they share the same taxonomy with at least X% identity.

Minimum coverage for aggregation:

OTUs will be aggregated if they share the same taxonomy with at least X% alignment coverage.

Resolves multi-hit ambiguities when exact amplicon lengths are available, and aggregates OTUs that share the same taxonomy, based on alignment metric thresholds.

Inputs

Abundance file:

The abundance of each OTU in each sample (format BIOM), with taxonomic affiliation metadata.

Sequence file:

The sequences (format FASTA) of each OTU seed.

Reference file (optional):

The exact amplicon reference sequences (format FASTA).

Outputs

Abundance file:

The abundance file of OTUs and aggregated OTUs, with their affiliations (format BIOM) and potentially fewer ambiguities.

Sequence file:

The sequences (format FASTA) of each aggregated OTU seed.

Composition file:

The aggregation composition file (format text) describing the composition of each resulting OTU.

As a first step, if a reference FASTA file is provided, then for each OTU with a multi-affiliation, only the affiliation whose reference sequence is the shortest is kept among the possible affiliations. The aim is to resolve ambiguities caused by reference sequences that may be contained within one another, as can happen with ITS.
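The shortest-reference rule can be sketched as follows. This is a minimal illustration, not the FROGS implementation: the function names, the "subject" and "taxonomy" keys, and the choice to keep every affiliation tied for the shortest length are all assumptions made for the example.

```python
def fasta_lengths(lines):
    """Return {sequence_id: length} from an iterable of FASTA lines."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence ID is the first word of the header line.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def keep_shortest_affiliations(affiliations, ref_lengths):
    """Keep the affiliation(s) whose reference amplicon is the shortest.

    affiliations: list of dicts with hypothetical keys "subject"
    (reference sequence ID) and "taxonomy".
    ref_lengths: dict mapping reference ID to amplicon length.
    Assumption: affiliations tied for the shortest length are all kept.
    """
    known = [a for a in affiliations if a["subject"] in ref_lengths]
    if not known:
        # No reference length available: leave the OTU untouched.
        return affiliations
    shortest = min(ref_lengths[a["subject"]] for a in known)
    return [a for a in known if ref_lengths[a["subject"]] == shortest]
```

With this sketch, an OTU whose candidate references have different lengths is reduced to the shortest one, while an OTU whose candidates all have the same length keeps its multi-affiliation.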

The second step aggregates OTUs that share the same taxonomy, as inferred from alignment metrics. We start with the most abundant OTU. If an OTU shares at least one affiliation with another OTU, with at least I% identity and C% alignment coverage, the two OTUs are aggregated: their different affiliations are merged (which then generates the multi-affiliation tag) and their abundance counts are summed. The seed of the most abundant OTU is kept.
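A rough sketch of this greedy aggregation is given below. It is an illustration under stated assumptions rather than the tool's actual code: the dictionary keys ("name", "abundance", "affiliations", "taxonomy", "identity", "coverage") are hypothetical, and the thresholds are assumed to apply to each OTU's own alignment against the shared affiliation.

```python
def aggregate_otus(otus, min_identity, min_coverage):
    """Greedily aggregate OTUs that share a sufficiently supported taxonomy.

    otus: list of dicts with hypothetical keys "name", "abundance" and
    "affiliations" (a list of dicts with "taxonomy", "identity", "coverage").
    Returns one dict per resulting OTU: seed, abundance, affiliations, members.
    """
    aggregated = []
    taxonomy_owner = {}  # taxonomy -> index of the aggregated OTU that holds it

    # Process the most abundant OTUs first so that they become the seeds.
    for otu in sorted(otus, key=lambda o: o["abundance"], reverse=True):
        # Affiliations that pass both thresholds (assumed interpretation).
        reliable = [
            a["taxonomy"]
            for a in otu["affiliations"]
            if a["identity"] >= min_identity and a["coverage"] >= min_coverage
        ]
        target = next(
            (taxonomy_owner[t] for t in reliable if t in taxonomy_owner), None
        )
        if target is None:
            # No shared affiliation: this OTU starts a new aggregated OTU.
            aggregated.append({
                "seed": otu["name"],
                "abundance": otu["abundance"],
                "affiliations": list(otu["affiliations"]),
                "members": [otu["name"]],
            })
            target = len(aggregated) - 1
        else:
            # Shared affiliation: sum abundances and merge affiliations.
            agg = aggregated[target]
            agg["abundance"] += otu["abundance"]
            agg["members"].append(otu["name"])
            known = {a["taxonomy"] for a in agg["affiliations"]}
            agg["affiliations"].extend(
                a for a in otu["affiliations"] if a["taxonomy"] not in known
            )
        for t in reliable:
            taxonomy_owner.setdefault(t, target)
    return aggregated
```

In this sketch the "members" list plays the role of the composition file: it records which input OTUs ended up in each resulting OTU.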

Contact

Support: please contact your Galaxy support team first.

Contacts: frogs@inra.fr

Repository: https://github.com/geraldinepascal/FROGS

Website: http://frogs.toulouse.inra.fr/

Please cite the FROGS article: Escudie F., et al. Bioinformatics, 2018. FROGS: Find, Rapidly, OTUs with Galaxy Solution.