diff test-data/test_curated_nov_anno_out.txt @ 2:9ca209477dfd draft default tip
planemo upload for repository https://github.com/Onnodg/Naturalis_NLOOR/tree/main/NLOOR_scripts/process_annotations_tool commit 4017d38cf327c48a6252e488ba792527dae97a70-dirty
| author | onnodg |
|---|---|
| date | Mon, 15 Dec 2025 16:43:36 +0000 |
| parents | |
| children | |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/test_curated_nov_anno_out.txt Mon Dec 15 16:43:36 2025 +0000
@@ -0,0 +1,53 @@
+Starting processing for FASTA: test-data/test_curated_nov.fasta
+=== PARAMETERS USED ===
+input_anno: test-data/test_curated_nov_blast_headers.tabular
+input_unanno: test-data/test_curated_nov.fasta
+eval_plot: test-data/test_curated_nov_eval.png
+taxa_output: test-data/test_curated_nov_taxa_output.txt
+circle_data: test-data/test_curated_nov_circle_data.txt
+header_anno: test-data/test_curated_nov_header_anno_excel.xlsx
+anno_stats: test-data/test_curated_nov_anno_out.txt
+filtered_fasta: test-data/test_curated_nov_filtered.fasta
+uncertain_threshold: 0.9
+eval_threshold: 1e-10
+use_counts: True
+ignore_rank: unknown
+ignore_taxonomy: environmental
+bitscore_perc_cutoff: 8.0
+min_bitscore: 100
+ignore_obiclean_type: singleton
+ignore_illuminapairend_type: pairend
+min_identity: 80
+min_coverage: 70
+ignore_seqids:
+min_support: 1
+=== END PARAMETERS ===
+Filtered FASTA written to: test-data/test_curated_nov_filtered.fasta (1790 sequences)
+FASTA: total headers: 2156
+FASTA: headers kept after filters and min_support=1: 1790
+FASTA: removed due to header filters (illumina/obiclean/etc.): 366
+FASTA: removed due to low dereplicated count (<1): 0
+FASTA: total invalid (header filter + low support): 366
+Reading BLAST annotations: test-data/test_curated_nov_blast_headers.tabular
+BLAST: total hits read: 4977
+BLAST: hits kept after quality filters: 3145
+BLAST: hits filtered (evalue/coverage/identity/bitscore): 1832
+BLAST: hits removed due to invalid taxon: 0
+BLAST: hits removed due to ignored seqids: 0
+Note: 30 BLAST q_ids not in FASTA (showing up to 10): ['M01687:460:000000000-LGY9G:1:1101:11918:3518_CONS(1)', 'M01687:460:000000000-LGY9G:1:1101:12996:3690_CONS(1)', 'M01687:460:000000000-LGY9G:1:1101:11564:11468_CONS(1)', 'M01687:460:000000000-LGY9G:1:1102:19358:5472_CONS(1)', 'M01687:460:000000000-LGY9G:1:2114:4805:4734_CONS(1)', 'M01687:460:000000000-LGY9G:1:2114:7472:19038_CONS(1)', 'M01687:460:000000000-LGY9G:1:2112:26865:11154_CONS(1)', 'M01687:460:000000000-LGY9G:1:2113:29518:11119_CONS(1)', 'M01687:460:000000000-LGY9G:1:2113:14681:23251_CONS(1)', 'M01687:460:000000000-LGY9G:1:2110:17890:1754_CONS(2)']
+ANNOTATION: total FASTA headers considered: 1790
+ANNOTATION: reads with BLAST hits: 622
+ANNOTATION: reads without BLAST hits: 1168
+ANNOTATION: unique annotated count (from header counts): 49571
+ANNOTATION: total unique count (from FASTA): 66132
+E-value plot written to: test-data/test_curated_nov_eval.png
+Taxa summary written to: test-data/test_curated_nov_taxa_output.txt
+Header annotations written to: test-data/test_curated_nov_header_anno_excel.xlsx
+Circle diagram JSON written to: test-data/test_curated_nov_circle_data.txt
+=== ANNOTATION STATISTICS ===
+percentage_annotated: 28.84972170686456
+annotated_sequences: 622
+total_sequences: 2156
+percentage_unique_annotated: 74.95766043670235
+unique_annotated: 49571
+total_unique: 66132
\ No newline at end of file
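The counts in this expected-output file can be cross-checked against each other. The following is a minimal sketch that recomputes the two percentages and the intermediate totals from the numbers in the log. The formulas are inferred from the logged values, not taken from the tool's source, and the `percentage` helper and variable names are hypothetical. Note that `percentage_annotated` appears to be computed against the total header count (2156), not the 1790 headers kept after filtering.

```python
# Counts copied from the log in test_curated_nov_anno_out.txt.
total_headers = 2156          # FASTA: total headers
headers_kept = 1790           # FASTA: headers kept after filters
headers_removed = 366         # FASTA: total invalid
hits_read = 4977              # BLAST: total hits read
hits_kept = 3145              # BLAST: hits kept after quality filters
hits_filtered = 1832          # BLAST: hits filtered
reads_with_hits = 622         # ANNOTATION: reads with BLAST hits
reads_without_hits = 1168     # ANNOTATION: reads without BLAST hits
unique_annotated = 49571      # ANNOTATION: unique annotated count
total_unique = 66132          # ANNOTATION: total unique count


def percentage(part: int, total: int) -> float:
    """Return part/total as a percentage (hypothetical helper).

    Returns 0.0 for an empty total to avoid a ZeroDivisionError.
    """
    return 100.0 * part / total if total else 0.0


# Assumed formulas, inferred from the logged numbers.
pct_annotated = percentage(reads_with_hits, total_headers)
pct_unique_annotated = percentage(unique_annotated, total_unique)

# Internal consistency of the intermediate counts.
checks = {
    "fasta_split": headers_kept + headers_removed == total_headers,
    "blast_split": hits_kept + hits_filtered == hits_read,
    "annotation_split": reads_with_hits + reads_without_hits == headers_kept,
}

print(f"percentage_annotated: {pct_annotated:.2f}")
print(f"percentage_unique_annotated: {pct_unique_annotated:.2f}")
print(checks)
```

A check like this can help when regenerating the test data: if a filter parameter changes, the three splits should still add up even though the individual counts differ.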
