planemo upload for repository https://github.com/Onnodg/Naturalis_NLOOR/tree/main/NLOOR_scripts/process_clusters_tool commit ef31054ae26e19eff2f1b1f6c7979e39c47c0d5b-dirty
author onnodg
date Fri, 24 Oct 2025 09:38:24 +0000

This script processes cluster output files from cd-hit-est for use in Galaxy.
It extracts cluster information, associates taxa and e-values from an optional
annotation file, performs statistical calculations, and generates text, plot,
and Excel outputs summarizing similarity and taxonomic distributions.


Main steps:
1. Parse the cd-hit-est cluster file and, if provided, the annotation file.
2. Process each cluster to extract similarity, taxa, and e-value information.
3. Aggregate results across clusters.
4. Generate requested outputs: text summaries, plots, and Excel reports.
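
As an illustration of steps 1 to 3, here is a minimal sketch of reading a cd-hit-est `.clstr` file and summarizing similarity per cluster. It is not the tool's actual code: the names `parse_clstr` and `mean_similarity` are hypothetical, and it assumes the default cd-hit-est member line layout (`0	153nt, >seq1... *` for the representative, `1	150nt, >seq2... at +/98.67%` for other members). Annotation handling is left out because the annotation file format is not described here.

```python
import re

# Member line in default cd-hit-est output (assumed layout), e.g.
#   "0\t153nt, >seq1... *"               representative sequence
#   "1\t150nt, >seq2... at +/98.67%"     member with strand and identity
MEMBER_RE = re.compile(
    r"^\d+\s+(?P<length>\d+)nt,\s+>(?P<name>.+?)\.\.\.\s+"
    r"(?:\*|at\s+(?P<strand>[+-])/(?P<identity>[\d.]+)%)\s*$"
)


def parse_clstr(lines):
    """Return a list of clusters, each a dict with an id and its members."""
    clusters = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">Cluster"):
            current = {"id": int(line.split()[1]), "members": []}
            clusters.append(current)
            continue
        match = MEMBER_RE.match(line)
        if match is None or current is None:
            raise ValueError(f"Unrecognized cluster line: {line!r}")
        identity = match.group("identity")
        current["members"].append({
            "name": match.group("name"),
            "length": int(match.group("length")),
            # The representative is marked with '*' and has no identity value.
            "is_representative": identity is None,
            "similarity": float(identity) if identity is not None else None,
        })
    return clusters


def mean_similarity(cluster):
    """Mean identity (%) of non-representative members, or None if there are none."""
    values = [m["similarity"] for m in cluster["members"]
              if m["similarity"] is not None]
    return sum(values) / len(values) if values else None
```

In practice the input would be the lines of an open file, for example `parse_clstr(open(path))`, where `path` points to the cluster file. Lines in other layouts (such as output that includes alignment coordinates) raise a `ValueError` in this sketch rather than being skipped silently.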


Note: The script uses the non-interactive matplotlib backend (Agg), which renders plots to files without a display, for compatibility with Galaxy.
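
For reference, a minimal sketch of selecting the Agg backend and writing a plot to disk. The function name `save_similarity_histogram`, the axis labels, and the bin count are illustrative assumptions, not taken from the tool itself.

```python
import matplotlib

matplotlib.use("Agg")  # select the non-interactive backend before importing pyplot

import matplotlib.pyplot as plt


def save_similarity_histogram(similarities, output_path):
    """Write a histogram of similarity values (%) to output_path."""
    fig, ax = plt.subplots()
    ax.hist(similarities, bins=20)  # bin count is an arbitrary choice
    ax.set_xlabel("Similarity (%)")
    ax.set_ylabel("Number of sequences")
    fig.savefig(output_path)
    plt.close(fig)  # release the figure's memory in long-running jobs
    return output_path
```

Because Agg only writes to files, no window is opened and no display server is required.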