diff accuracy.xml @ 4:4494c973f643 draft default tip
Deleted selected files
author | testtool |
---|---|
date | Fri, 13 Oct 2017 10:15:08 -0400 |
parents | a5a5716e0317 |
children |
--- a/accuracy.xml	Fri Oct 13 10:14:29 2017 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,56 +0,0 @@
-<tool id="accuracy" name="accuracy" version="1.0.0">
-  <description>model creation and accuracy estimation</description>
-  <requirements>
-    <requirement type="package" version="6.0_76">r-caret</requirement>
-  </requirements>
-  <command detect_errors="aggressive">
-    Rscript '$__tool_directory__/accuracy.R' '$input' '$p' '$output1' '$output2'
-  </command>
-<inputs>
-  <param format="csv" type="data" name="input" value="" label="Input dataset" help="
-  e.g. iris species table
-Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
-5.1,3.5,1.4,0.2,Iris-setosa
-4.9,3,1.4,0.2,Iris-setosa
-4.7,3.2,1.3,0.2,Iris-setosa
-4.6,3.1,1.5,0.2,Iris-setosa''"/>
-  <param name="p" type="integer" value="0.80" label="Select % of data to training and testing the models"/>
-  </inputs>
-  <outputs>
-    <data format="csv" name="output1" label="dataset_summary.csv" />
-    <data format="csv" name="output2" label="accuracy_summary.csv" />
-  </outputs>
-  <tests>
-    <test>
-      <param name="test">
-        <element name="test-data">
-          <collection type="data">
-            <element format="csv" name="input" label="test-data/input.csv"/>
-          </collection>
-        </element>
-      </param>
-      <output format="csv" name="fit" label="test-data/dataset_summary.csv"/>
-      <output format="csv" name="fit" label="test-data/accuracy_summary.csv"/>
-    </test>
-  </tests>
-  <help>
-Tool allow us to build 5 different models to predict e.g. species from flower measurements.
-In the end we can select the best model for further analysis.
-
-Let’s evaluate 5 different algorithms:
-
-**Linear Discriminant Analysis (LDA)**
-**Classification and Regression Trees (CART).**
-**k-Nearest Neighbors (kNN).**
-**Support Vector Machines (SVM) with a linear kernel.**
-**Random Forest (RF)**
-
-This is a good mixture of simple linear (LDA), nonlinear (CART, kNN) and complex nonlinear methods (SVM, RF).
-We reset the random number seed before reach run to ensure that the evaluation of each algorithm is performed
-using exactly the same data splits. It ensures the results are directly comparable.
-
-</help>
-<citations>
-  <citation>https://CRAN.R-project.org/package=caret</citation>
-</citations>
-</tool>