
Remove chimeric sequences by sample.

Chimeras are sequences formed from two or more biological sequences joined together.

Most of these anomalous sequences arise from incomplete extension during a PCR cycle. During subsequent cycles, a partially extended strand can bind to a template derived from a different but similar sequence.

This phenomenon is particularly common in amplicon sequencing, where closely related sequences are amplified together.

Inputs

Sequence file:

The sequences (format FASTA).

Abundance file:

The abundance of each cluster in each sample (format BIOM).

Alternatively, the abundance of each sequence in each sample (format TSV). This type of file is produced by FROGS pre-process.

Example:
#id        splA    splB
seq1       1289    2901
seq2       3415    0
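
To make the TSV layout concrete, here is a minimal Python sketch that loads such a table into a dictionary. It is not part of FROGS: the function name `read_abundance_tsv` is hypothetical, and the sketch assumes tab-separated columns with the sequence identifier in the first column and one column per sample.

```python
import csv
import io


def read_abundance_tsv(handle):
    """Parse a tab-separated abundance table into {seq_id: {sample: count}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # the first column ("#id") holds the sequence id
    abundances = {}
    for row in reader:
        if not row:
            continue  # skip empty lines
        seq_id, counts = row[0], row[1:]
        abundances[seq_id] = {s: int(c) for s, c in zip(samples, counts)}
    return abundances


# The example table above, written with explicit tab separators.
example = "#id\tsplA\tsplB\nseq1\t1289\t2901\nseq2\t3415\t0\n"
table = read_abundance_tsv(io.StringIO(example))
```

With this structure, `table["seq2"]["splB"]` gives the abundance of seq2 in sample splB, which is 0 in the example.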

Outputs

Sequence file (non_chimera.fasta):

The sequence file containing only the non-chimeric sequences (format FASTA).

Abundance file (non_chimera.biom or non_chimera.tsv):

The abundance file containing only the non-chimeric sequences (same format as the abundance input).

Summary file (report.html):

This file presents the number of removed elements (format HTML).

Steps	Description
1	Split the input data by sample (PCR is typically performed per sample).
2	Detect chimeras in each sample (vsearch).
3	Remove the sequences identified as chimeras in all samples where they are present.
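
The logic of steps 1 and 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the FROGS implementation: the function names are hypothetical, the per-sample chimera calls of step 2 are supplied as a ready-made dictionary instead of being computed by vsearch, and seq3 and its counts are invented for the example.

```python
def split_by_sample(abundances):
    """Step 1: for each sample, collect the sequences with a non-zero count."""
    by_sample = {}
    for seq_id, counts in abundances.items():
        for sample, count in counts.items():
            if count > 0:
                by_sample.setdefault(sample, set()).add(seq_id)
    return by_sample


def chimeras_to_remove(by_sample, chimeras_by_sample):
    """Step 3: select sequences flagged in every sample where they occur."""
    removed = set()
    all_seqs = set().union(*by_sample.values()) if by_sample else set()
    for seq_id in all_seqs:
        present_in = [s for s, seqs in by_sample.items() if seq_id in seqs]
        if all(seq_id in chimeras_by_sample.get(s, set()) for s in present_in):
            removed.add(seq_id)
    return removed


# seq1 and seq2 come from the example table; seq3 is hypothetical.
abundances = {
    "seq1": {"splA": 1289, "splB": 2901},
    "seq2": {"splA": 3415, "splB": 0},
    "seq3": {"splA": 12, "splB": 7},
}
# Stand-in for step 2: hypothetical per-sample chimera calls.
chimeras_by_sample = {"splA": {"seq2", "seq3"}, "splB": set()}

by_sample = split_by_sample(abundances)
removed = chimeras_to_remove(by_sample, chimeras_by_sample)
kept = set(abundances) - removed
```

In this example seq2 is removed because it is flagged in splA, the only sample where it is present. seq3 is kept because it is flagged in splA but not in splB, so it is not a chimera in all samples where it occurs.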

Contact

Contacts: frogs@inra.fr

Repository: https://github.com/geraldinepascal/FROGS

Please cite the FROGS Publication: Escudie F., Auer L., Bernard M., Cauquil L., Vidal K., Maman S., Mariadassou M., Hernandez-Raquet G., Pascal G., 2015. FROGS: Find Rapidly OTU with Galaxy Solution. In: The environmental genomic Conference, Montpellier, France, http://bioinfo.genotoul.fr/fileadmin/user_upload/FROGS_2015_GE_Montpellier_poster.pdf

Depending on the help provided you can cite us in acknowledgements, references or both.