On Galaxy, you have to use paired collections as input for HybPiper assemblies. HybPiper relies on the directory hierarchy it creates for each sample during assembly. The hierarchy is based on the name of the sample, which you provide to Galaxy as the identifier in the collection.
If you have your sequencing reads in individual datasets, you can easily organise them into a paired collection. See the Galaxy training material on using dataset collections for a step-by-step guide.
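Before building the collection, it can help to check which files actually form complete pairs. The following is a minimal sketch, not part of Galaxy or HybPiper. It assumes a hypothetical file naming convention of <sample>_R1.fastq.gz and <sample>_R2.fastq.gz; the function name pair_reads and the pattern are illustrative only, so adjust them to match your own file names.

```python
import re
from collections import defaultdict

# Assumed naming convention (hypothetical, not defined by HybPiper or Galaxy):
#   <sample>_R1.fastq.gz and <sample>_R2.fastq.gz (also .fq, with or without .gz)
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.(fastq|fq)(\.gz)?$")


def pair_reads(filenames):
    """Group read files by sample name.

    Returns a tuple (complete, incomplete, unmatched):
      complete   - dict mapping sample name to its forward and reverse file
      incomplete - sorted list of sample names with only one of the two files
      unmatched  - list of file names that do not follow the assumed pattern
    """
    pairs = defaultdict(dict)
    unmatched = []
    for name in filenames:
        match = READ_PATTERN.match(name)
        if match is None:
            unmatched.append(name)
            continue
        direction = "forward" if match.group("read") == "1" else "reverse"
        pairs[match.group("sample")][direction] = name

    complete = {
        sample: files
        for sample, files in pairs.items()
        if set(files) == {"forward", "reverse"}
    }
    incomplete = sorted(sample for sample in pairs if sample not in complete)
    return complete, incomplete, unmatched
```

Samples reported as incomplete have only one read file. Since the tool only accepts paired input, resolve these before creating the collection.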
Note: because HybPiper uses sample identifiers to create directories, you can't use special characters in your sample identifiers. The only allowed characters are letters, numbers, underscores and hyphens.
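The rule above can be checked before you upload or rename your data. This is a minimal sketch, assuming that "letters" and "numbers" mean ASCII letters and digits. The helper names invalid_identifiers and sanitise_identifier are hypothetical and are not part of HybPiper or Galaxy.

```python
import re

# Allowed characters as stated in the note: letters, numbers, underscores, hyphens.
# Assumption: "letters" and "numbers" are restricted to ASCII (A-Z, a-z, 0-9).
VALID_IDENTIFIER = re.compile(r"[A-Za-z0-9_-]+")


def invalid_identifiers(identifiers):
    """Return the identifiers that are empty or contain disallowed characters."""
    return [name for name in identifiers if not VALID_IDENTIFIER.fullmatch(name)]


def sanitise_identifier(name, replacement="_"):
    """Replace every disallowed character with `replacement`."""
    return re.sub(r"[^A-Za-z0-9_-]", replacement, name)
```

For example, invalid_identifiers(["sample_1", "sample 2"]) flags "sample 2" because of the space. Note that sanitising can map two different names to the same identifier, so check the results for duplicates before renaming.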
You can't use single-end or unpaired reads as input to HybPiper on Galaxy.
The following HybPiper analyses are available on Galaxy:
Use the Type of hybpiper run drop-down to select an analysis.
HybPiper was designed for processing targeted sequence capture data. In targeted sequence capture, DNA sequencing libraries are enriched for gene regions of interest using bait sequences, which allows many loci to be sequenced simultaneously.
HybPiper is a suite of scripts that wrap and connect other tools to extract target sequences from the sequencing reads. The HybPiper pipeline starts with high-throughput sequencing reads (for example from Illumina MiSeq), and assigns them to target genes using DIAMOND. The reads are distributed to separate directories, where they are assembled independently using SPAdes. The main output is a collection of FASTA files of the (in-frame) CDS portion of the sample for each target region. You can also generate separate collections of files with the translated protein sequences, the intronic regions flanking each exon, and putative paralog sequences.
For more information, please see the HybPiper wiki.