annotate README.md @ 3:c6981ea453ae draft default tip
planemo upload for repository https://github.com/Onnodg/Naturalis_NLOOR/tree/main/NLOOR_scripts/process_clusters_tool commit ef31054ae26e19eff2f1b1f6c7979e39c47c0d5b-dirty
| author | onnodg |
|---|---|
| date | Fri, 24 Oct 2025 09:38:24 +0000 |
| parents | 706b7acdb230 |

This script processes cluster output files from cd-hit-est for use in Galaxy.
It extracts cluster information, associates taxa and e-values from annotation files,
performs statistical calculations, and generates text and plot outputs
summarizing similarity and taxonomic distributions.

Main steps:
1. Parse cd-hit-est cluster file and (optional) annotation file.
2. Process each cluster to extract similarity, taxa, and e-value information.
3. Aggregate results across clusters.
4. Generate requested outputs: text summaries, plots, and Excel reports.
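
As a rough illustration of steps 1 to 3, the sketch below parses cluster records and aggregates a per-cluster summary. It is not the tool's actual code: the function names `parse_clstr` and `summarize` are hypothetical, the sequence names are made up, and the record layout (`>Cluster N` headers, a `*` marking the representative, `at +/98.65%` for other members) is assumed from the commonly seen cd-hit-est `.clstr` output. Annotation handling (taxa and e-values) is left out because the annotation file format is not described here.

```python
import re
from statistics import mean

# Assumed member-line layout: "<index>\t<length>nt, ><name>... *"
# or "<index>\t<length>nt, ><name>... at +/98.65%".
MEMBER_RE = re.compile(
    r"^\d+\s+(\d+)(?:nt|aa),\s+>(.+?)\.\.\.\s+(\*|at\s+\S+)\s*$"
)


def parse_clstr(lines):
    """Return {cluster_id: [(seq_id, length, similarity_or_None), ...]}."""
    clusters = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = int(line.split()[1])
            clusters[current] = []
            continue
        match = MEMBER_RE.match(line)
        if match is None or current is None:
            continue  # skip anything that does not look like a member record
        length, seq_id, tail = match.groups()
        if tail == "*":
            similarity = None  # representative sequence of the cluster
        else:
            # The similarity is the last "/"-separated field, e.g. "98.65%".
            similarity = float(tail.split("/")[-1].rstrip("%"))
        clusters[current].append((seq_id, int(length), similarity))
    return clusters


def summarize(clusters):
    """Aggregate cluster size and mean similarity of non-representative members."""
    summary = {}
    for cluster_id, members in clusters.items():
        sims = [s for _, _, s in members if s is not None]
        summary[cluster_id] = {
            "size": len(members),
            "mean_similarity": mean(sims) if sims else None,
        }
    return summary


# Made-up example input in the assumed layout.
example = """>Cluster 0
0\t150nt, >seq1... *
1\t148nt, >seq2... at +/98.65%
2\t149nt, >seq3... at -/97.32%
>Cluster 1
0\t200nt, >seq4... *
""".splitlines()

if __name__ == "__main__":
    for cluster_id, stats in summarize(parse_clstr(example)).items():
        print(cluster_id, stats)
```

Keeping parsing and aggregation in separate functions mirrors the step list: the parsed dictionary can be reused for text summaries, plots, and spreadsheet reports without reading the file again.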

Note: The script uses a non-interactive matplotlib backend (Agg) for compatibility with Galaxy.
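
For readers unfamiliar with backend selection, here is a minimal sketch of how the Agg backend is typically chosen before `pyplot` is imported, so that figures are written to files without needing a display. The helper name `save_similarity_histogram`, the axis labels, and the bin count are illustrative assumptions, not taken from the tool.

```python
import matplotlib

matplotlib.use("Agg")  # select the non-interactive backend before importing pyplot

import matplotlib.pyplot as plt


def save_similarity_histogram(similarities, out_path):
    """Write a histogram of similarity percentages to out_path (hypothetical helper)."""
    fig, ax = plt.subplots()
    ax.hist(similarities, bins=10)
    ax.set_xlabel("Similarity (%)")
    ax.set_ylabel("Number of sequences")
    fig.savefig(out_path)
    plt.close(fig)  # release the figure so repeated calls do not accumulate memory
```

A call such as `save_similarity_histogram([97.3, 98.6, 99.1], "similarity.png")` writes the image directly to disk; no window is opened.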
