diff test-data/daan_anno_stats.txt @ 2:9ca209477dfd draft default tip

planemo upload for repository https://github.com/Onnodg/Naturalis_NLOOR/tree/main/NLOOR_scripts/process_annotations_tool commit 4017d38cf327c48a6252e488ba792527dae97a70-dirty
author onnodg
date Mon, 15 Dec 2025 16:43:36 +0000
parents
children
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/daan_anno_stats.txt	Mon Dec 15 16:43:36 2025 +0000
@@ -0,0 +1,53 @@
+Starting processing for FASTA: test-data/daan_test_unanno.fasta
+=== PARAMETERS USED ===
+input_anno: test-data/daan_test.tabular
+input_unanno: test-data/daan_test_unanno.fasta
+eval_plot: test-data/daan_eval.png
+taxa_output: test-data/daan_taxa.txt
+circle_data: test-data/daan_test.txt
+header_anno: test-data/daan_test.xlsx
+anno_stats: test-data/daan_anno_stats.txt
+filtered_fasta: test-data/daan_filtered_fasta.fasta
+uncertain_threshold: 90.0
+eval_threshold: 1e-10
+use_counts: False
+ignore_rank: unkown
+ignore_taxonomy: environmental
+bitscore_perc_cutoff: 10.0
+min_bitscore: 40
+ignore_obiclean_type: singleton
+ignore_illuminapairend_type: pairend
+min_identity: 70
+min_coverage: 70
+ignore_seqids: 
+min_support: 1
+=== END PARAMETERS ===
+Filtered FASTA written to: test-data/daan_filtered_fasta.fasta (414 sequences)
+FASTA: total headers: 532
+FASTA: headers kept after filters and min_support=1: 414
+FASTA: removed due to header filters (illumina/obiclean/etc.): 118
+FASTA: removed due to low dereplicated count (<1): 0
+FASTA: total invalid (header filter + low support): 118
+Reading BLAST annotations: test-data/daan_test.tabular
+BLAST: total hits read: 70
+BLAST: hits kept after quality filters: 70
+BLAST: hits filtered (evalue/coverage/identity/bitscore): 0
+BLAST: hits removed due to invalid taxon: 0
+BLAST: hits removed due to ignored seqids: 0
+Note: 15 BLAST q_ids not in FASTA (showing up to 10): ['M01687:476:000000000-LL5F5:1:1102:16245:9240_CONS(1)', 'M01687:476:000000000-LL5F5:1:2114:3313:18654_CONS(3)', 'M01687:476:000000000-LL5F5:1:2112:19173:20011_CONS(1)', 'M01687:476:000000000-LL5F5:1:2111:13710:23471_CONS(2)', 'M01687:476:000000000-LL5F5:1:2107:11226:8080_CONS(1)', 'M01687:476:000000000-LL5F5:1:2104:21459:14659_CONS(1)', 'M01687:476:000000000-LL5F5:1:2103:8294:17591_CONS(1)', 'M01687:476:000000000-LL5F5:1:2103:20035:24420_CONS(1)', 'M01687:476:000000000-LL5F5:1:2101:19159:13262_CONS(1)', 'M01687:476:000000000-LL5F5:1:1114:20282:19626_CONS(1)']
+ANNOTATION: total FASTA headers considered: 414
+ANNOTATION: reads with BLAST hits: 2
+ANNOTATION: reads without BLAST hits: 412
+ANNOTATION: unique annotated count (from header counts): 36
+ANNOTATION: total unique count (from FASTA): 3682
+E-value plot written to: test-data/daan_eval.png
+Taxa summary written to: test-data/daan_taxa.txt
+Header annotations written to: test-data/daan_test.xlsx
+Circle diagram JSON written to: test-data/daan_test.txt
+=== ANNOTATION STATISTICS ===
+percentage_annotated: 0.37593984962406013
+annotated_sequences: 2
+total_sequences: 532
+percentage_unique_annotated: 0.9777294948397609
+unique_annotated: 36
+total_unique: 3682
\ No newline at end of file
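
The figures in the ANNOTATION STATISTICS block of the log above appear to follow from the counts reported earlier in the same log. The sketch below is only an illustration of that arithmetic, not the tool's actual code: the function name `annotation_stats` and its parameters are hypothetical, and it assumes the percentages are computed as 100 * part / whole, with `total_sequences` taken as the unfiltered FASTA header count (532) rather than the 414 headers kept after filtering.

```python
import math


def annotation_stats(annotated, total, unique_annotated, total_unique):
    """Hypothetical helper: rebuild the statistics block from raw counts."""

    def pct(part, whole):
        # Guard against an empty input so the sketch never divides by zero.
        return 100.0 * part / whole if whole else 0.0

    return {
        "percentage_annotated": pct(annotated, total),
        "annotated_sequences": annotated,
        "total_sequences": total,
        "percentage_unique_annotated": pct(unique_annotated, total_unique),
        "unique_annotated": unique_annotated,
        "total_unique": total_unique,
    }


# Counts copied from the log: 2 reads with BLAST hits out of 532 headers,
# and 36 annotated unique counts out of 3682 in total.
stats = annotation_stats(2, 532, 36, 3682)

# Consistency checks on the FASTA section of the same log.
total_headers, removed, kept = 532, 118, 414
assert total_headers - removed == kept  # headers kept after filters
assert 2 + 412 == kept                  # reads with hits + reads without hits

for key, value in stats.items():
    print(f"{key}: {value}")
```

Under these assumptions the two percentages agree with the logged values (roughly 0.376 and 0.978) to floating-point tolerance, which suggests both are expressed in percent rather than as fractions.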