comparison test-data/consensus_anno_stats.txt @ 2:9ca209477dfd draft default tip

planemo upload for repository https://github.com/Onnodg/Naturalis_NLOOR/tree/main/NLOOR_scripts/process_annotations_tool commit 4017d38cf327c48a6252e488ba792527dae97a70-dirty
author onnodg
date Mon, 15 Dec 2025 16:43:36 +0000
1:2acf82433aa4 2:9ca209477dfd
Starting processing for FASTA: test-data/consensus_anno_test.fasta
=== PARAMETERS USED ===
input_anno: test-data/consensus_test_header.tabular
input_unanno: test-data/consensus_anno_test.fasta
eval_plot: test-data/consensus_eval.png
taxa_output: test-data/consensus_taxa.txt
circle_data: test-data/consensus_circle.txt
header_anno: test-data/consensus_annotations.xlsx
anno_stats: test-data/consensus_anno_stats.txt
filtered_fasta: test-data/consensus_filtered_fasta.fasta
uncertain_threshold: 90.0
eval_threshold: 1e-10
use_counts: False
ignore_rank: unkown
ignore_taxonomy: environmental
bitscore_perc_cutoff: 10.0
min_bitscore: 40
ignore_obiclean_type: singleton
ignore_illuminapairend_type: pairend
min_identity: 70
min_coverage: 70
ignore_seqids:
min_support: 2
=== END PARAMETERS ===
Filtered FASTA written to: test-data/consensus_filtered_fasta.fasta (269 sequences)
FASTA: total headers: 1937
FASTA: headers kept after filters and min_support=2: 269
FASTA: removed due to header filters (illumina/obiclean/etc.): 365
FASTA: removed due to low dereplicated count (<2): 1303
FASTA: total invalid (header filter + low support): 1668
Reading BLAST annotations: test-data/consensus_test_header.tabular
BLAST: total hits read: 889
BLAST: hits kept after quality filters: 767
BLAST: hits filtered (evalue/coverage/identity/bitscore): 122
BLAST: hits removed due to invalid taxon: 0
BLAST: hits removed due to ignored seqids: 0
Note: 37 BLAST q_ids not in FASTA (showing up to 10): ['M01687:460:000000000-LGY9G:1:1101:12578:8821_CONS(1)', 'M01687:460:000000000-LGY9G:1:1102:19139:9563_CONS(1)', 'M01687:460:000000000-LGY9G:1:2114:3286:13205_CONS(1)', 'M01687:460:000000000-LGY9G:1:2115:26033:6510_CONS(1)', 'M01687:460:000000000-LGY9G:1:2115:28971:12582_CONS(1)', 'M01687:460:000000000-LGY9G:1:2115:9332:15134_CONS(1)', 'M01687:460:000000000-LGY9G:1:2112:16849:24934_CONS(1)', 'M01687:460:000000000-LGY9G:1:2113:2910:9568_CONS(2)', 'M01687:460:000000000-LGY9G:1:2113:24519:22653_CONS(1)', 'M01687:460:000000000-LGY9G:1:2108:19954:3938_CONS(1)']
ANNOTATION: total FASTA headers considered: 269
ANNOTATION: reads with BLAST hits: 4
ANNOTATION: reads without BLAST hits: 265
ANNOTATION: unique annotated count (from header counts): 384
ANNOTATION: total unique count (from FASTA): 24251
E-value plot written to: test-data/consensus_eval.png
Taxa summary written to: test-data/consensus_taxa.txt
Header annotations written to: test-data/consensus_annotations.xlsx
Circle diagram JSON written to: test-data/consensus_circle.txt
=== ANNOTATION STATISTICS ===
percentage_annotated: 0.20650490449148168
annotated_sequences: 4
total_sequences: 1937
percentage_unique_annotated: 1.5834398581501794
unique_annotated: 384
total_unique: 24251
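
The counts reported in this log can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the tool's source: the variable names and the `percentage` helper are hypothetical, and the formula (100 * part / whole) is an assumption that happens to reproduce the logged values. Note that under this assumption percentage_annotated is computed against total_sequences (1937), not against the 269 headers kept after filtering.

```python
import math

# Counts copied from the log above.
fasta_total = 1937
removed_header_filters = 365
removed_low_support = 1303
fasta_kept = 269
blast_total = 889
blast_filtered = 122
blast_kept = 767
reads_with_hits = 4
reads_without_hits = 265
unique_annotated = 384
total_unique = 24251


def percentage(part, whole):
    """Hypothetical helper: share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole if whole else 0.0


# FASTA bookkeeping: removed headers plus kept headers equals the total.
total_invalid = removed_header_filters + removed_low_support
assert total_invalid == 1668
assert fasta_total - total_invalid == fasta_kept

# BLAST bookkeeping: hits read minus hits filtered equals hits kept.
assert blast_total - blast_filtered == blast_kept

# Annotation bookkeeping: every considered header either has a hit or not.
assert reads_with_hits + reads_without_hits == fasta_kept

# Percentages, assumed to be 100 * part / whole.
percentage_annotated = percentage(reads_with_hits, fasta_total)
percentage_unique_annotated = percentage(unique_annotated, total_unique)

# Compare with the logged values, allowing for floating-point rounding.
assert math.isclose(percentage_annotated, 0.20650490449148168, rel_tol=1e-9)
assert math.isclose(percentage_unique_annotated, 1.5834398581501794, rel_tol=1e-9)

print(f"percentage_annotated: {percentage_annotated:.4f}")
print(f"percentage_unique_annotated: {percentage_unique_annotated:.4f}")
```

If any of the assertions fails after the test data is regenerated, the statistics file and the filter counts are no longer consistent with each other.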