# HG changeset patch
# User q2d2
# Date 1714080123 0
# Node ID 4366a48a594af4b351859026ed850444a95449b8
planemo upload for repository https://github.com/qiime2/galaxy-tools/tree/main/tools/suite_qiime2__rescript commit 389df0134cd0763dcf02aac6e623fc15f8861c1e
diff -r 000000000000 -r 4366a48a594a qiime2__rescript__evaluate_fit_classifier.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/qiime2__rescript__evaluate_fit_classifier.xml Thu Apr 25 21:22:03 2024 +0000
@@ -0,0 +1,96 @@
+
+
+
+
+ Evaluate and train naive Bayes classifier on reference sequences.
+
+ quay.io/qiime2/amplicon:2024.2
+
+ q2galaxy version rescript
+ q2galaxy run rescript evaluate_fit_classifier '$inputs'
+
+
+
+
+
+
+
+
+ hasattr(value.metadata, "semantic_type") and value.metadata.semantic_type in ['FeatureData[Sequence]']
+
+
+
+
+
+ hasattr(value.metadata, "semantic_type") and value.metadata.semantic_type in ['FeatureData[Taxonomy]']
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + +
+
+
+
+
+
+
+
+
+QIIME 2: rescript evaluate-fit-classifier
+=========================================
+Evaluate and train naive Bayes classifier on reference sequences.
+
+
+Outputs:
+--------
+:classifier.qza: Trained naive Bayes taxonomic classifier.
+:evaluation.qzv: Visualization of classification accuracy results.
+:observed_taxonomy.qza: Observed taxonomic label for each input sequence, predicted by the trained classifier.
+
+|
+
+Description:
+------------
+Train a naive Bayes classifier on a set of reference sequences, then test performance accuracy on this same set of sequences. This results in a "perfect" classifier that "knows" the correct identity of each input sequence. Such a leaky classifier indicates the upper limit of classification accuracy based on sequence information alone, as misclassifications are an indication of unresolvable kmer profiles. This test simulates the case where all query sequences are present in a fully comprehensive reference database. To simulate more realistic conditions, see `evaluate_cross_validate`. THE CLASSIFIER OUTPUT BY THIS PIPELINE IS PRODUCTION-READY and can be re-used for classification of other sequences (provided the reference data are viable), hence THIS PIPELINE IS USEFUL FOR TRAINING FEATURE CLASSIFIERS AND THEN EVALUATING THEM ON-THE-FLY.
+
+
+|
+
+
+
+ 10.1186/s40168-018-0470-z
+ 10.1371/journal.pcbi.1009581
+ 10.1038/s41587-019-0209-9
+
+
diff -r 000000000000 -r 4366a48a594a test-data/.gitkeep