What it does
Remove chimeric sequences by sample.
Context
Chimeras are sequences formed from two or more biological sequences joined together.
The majority of these anomalous sequences are formed from an incomplete extension during a PCR cycle. During subsequent cycles, a partially extended strand can bind to a template derived from a different but similar sequence.
This phenomenon is particularly common in amplicon sequencing, where closely related sequences are amplified.
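To make the mechanism concrete, the sketch below builds a toy chimera by joining the start of one parent sequence to the end of another. The function name `make_chimera`, the two parent sequences, and the crossover position are illustrative assumptions and are not part of FROGS.

```python
def make_chimera(parent_a: str, parent_b: str, crossover: int) -> str:
    """Join the first `crossover` bases of parent_a to the rest of parent_b.

    Toy model of an incomplete extension: the partial strand copied from
    parent_a is completed on a parent_b template in a later PCR cycle.
    """
    if not 0 < crossover < min(len(parent_a), len(parent_b)):
        raise ValueError("crossover must fall inside both parents")
    return parent_a[:crossover] + parent_b[crossover:]


# Two hypothetical, closely related parents that differ at a few positions.
parent_a = "ACGTACGTTTGACCA"
parent_b = "ACGAACGTTTGTCCG"
chimera = make_chimera(parent_a, parent_b, crossover=8)
print(chimera)
```

The result shares its first half with one parent and its second half with the other, which is why chimeras are hard to tell apart from real variants when the parents are closely related.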
Inputs/Outputs
Inputs
Sequence file:
The sequences (format FASTA).
Abundance file:
The abundance of each cluster in each sample (format BIOM).
OR
The abundance of each sequence in each sample (format TSV). This type of file is produced by FROGS pre-process.
Example:
#id	splA	splB
seq1	1289	2901
seq2	3415	0
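As a minimal sketch, the TSV layout from the example could be loaded in Python as follows. The helper name `read_abundance_tsv` is hypothetical, and the code assumes a tab-delimited file whose first column is the sequence id and whose remaining columns are per-sample counts.

```python
import csv
import io


def read_abundance_tsv(handle):
    """Return (samples, counts), where counts maps sequence id -> {sample: count}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # the first column holds the sequence id
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        counts[row[0]] = {s: int(v) for s, v in zip(samples, row[1:])}
    return samples, counts


# Same content as the example above, written with explicit tab separators.
example = "#id\tsplA\tsplB\nseq1\t1289\t2901\nseq2\t3415\t0\n"
samples, counts = read_abundance_tsv(io.StringIO(example))
print(samples)
print(counts["seq2"])
```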
Outputs
Sequence file (non_chimera.fasta):
The sequence file containing only non-chimeric sequences (format FASTA).
Abundance file (non_chimera.biom or non_chimera.tsv):
The abundance file containing only non-chimeric sequences (same format as the abundance input).
Summary file (report.html):
This file presents the number of removed elements (format HTML).
How it works
Steps | Description |
---|---|
1 | Split input data by sample (typically, the PCR is performed per sample). |
2 | Find chimeras in each sample (vsearch). |
3 | Remove the sequences identified as chimeras in all samples where they are present. |
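The three steps can be sketched in Python as shown below. This is an illustration under stated assumptions, not the FROGS implementation: the function names are hypothetical, `detect_chimeras` is a placeholder callable standing in for the vsearch run, and step 3 is read here as "remove a sequence only when it is flagged in every sample in which it occurs".

```python
from collections import defaultdict


def split_by_sample(counts):
    """Step 1: map each sample to the sequences observed in it (count > 0)."""
    per_sample = defaultdict(dict)
    for seq_id, by_sample in counts.items():
        for sample, count in by_sample.items():
            if count > 0:
                per_sample[sample][seq_id] = count
    return dict(per_sample)


def sequences_to_remove(per_sample, detect_chimeras):
    """Steps 2-3: flag chimeras per sample, keep ids flagged wherever they occur."""
    present_in = defaultdict(int)
    flagged_in = defaultdict(int)
    for sample, abundances in per_sample.items():
        # Placeholder for the per-sample chimera search (assumption, not vsearch itself).
        flagged = detect_chimeras(sample, abundances)
        for seq_id in abundances:
            present_in[seq_id] += 1
            if seq_id in flagged:
                flagged_in[seq_id] += 1
    return {s for s in present_in if flagged_in[s] == present_in[s]}


counts = {
    "seq1": {"splA": 1289, "splB": 2901},
    "seq2": {"splA": 3415, "splB": 0},
    "seq3": {"splA": 12, "splB": 7},
}
# Hypothetical detector output: ids that would be flagged in each sample.
fake_calls = {"splA": {"seq2", "seq3"}, "splB": set()}


def detector(sample, abundances):
    return fake_calls[sample]


removed = sequences_to_remove(split_by_sample(counts), detector)
kept = {s: c for s, c in counts.items() if s not in removed}
print(sorted(removed))
```

In this toy data, seq2 occurs only in splA and is flagged there, so it is removed. seq3 is flagged in splA but not in splB, so under the assumed reading of step 3 it is kept.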
Contact
Contacts: frogs@inra.fr
Repository: https://github.com/geraldinepascal/FROGS
Please cite the FROGS Publication: Escudie F., Auer L., Bernard M., Cauquil L., Vidal K., Maman S., Mariadassou M., Hernandez-Raquet G., Pascal G., 2015. FROGS: Find Rapidly OTU with Galaxy Solution. In: The environmental genomic Conference, Montpellier, France, http://bioinfo.genotoul.fr/fileadmin/user_upload/FROGS_2015_GE_Montpellier_poster.pdf
Depending on the help provided, you can cite us in acknowledgements, references, or both.