comparison accuracy.xml @ 2:6169ba9ed42a draft
Uploaded
author | testtool |
---|---|
date | Fri, 13 Oct 2017 10:10:32 -0400 |
1:a3a8499f0f95 | 2:6169ba9ed42a |
---|---|
1 <tool id="accuracy" name="accuracy" version="1.0.0"> | |
2 <description>model creation and accuracy estimation</description> | |
3 <requirements> | |
4 <requirement type="package" version="6.0_76">r-caret</requirement> | |
5 </requirements> | |
6 <command detect_errors="aggressive"> | |
7 Rscript '$__tool_directory__/accuracy.R' '$input' '$p' '$output1' '$output2' | |
8 </command> | |
9 <inputs> | |
10 <param format="csv" type="data" name="input" value="" label="Input dataset" help=" | |
11 e.g. iris species table | |
12 Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species | |
13 5.1,3.5,1.4,0.2,Iris-setosa | |
14 4.9,3,1.4,0.2,Iris-setosa | |
15 4.7,3.2,1.3,0.2,Iris-setosa | |
16 4.6,3.1,1.5,0.2,Iris-setosa"/> | |
17 <param name="p" type="float" value="0.80" label="Proportion of the data used for training the models; the remainder is used for testing"/> | |
18 </inputs> | |
19 <outputs> | |
20 <data format="csv" name="output1" label="dataset_summary.csv" /> | |
21 <data format="csv" name="output2" label="accuracy_summary.csv" /> | |
22 </outputs> | |
23 <tests> | |
24 <test> | |
25 <!-- Input file, resolved relative to the tool's test-data directory --> | |
26 <param name="input" value="input.csv" ftype="csv"/> | |
27 <!-- Split proportion; same value as the default of p --> | |
28 <param name="p" value="0.80"/> | |
29 <!-- Expected outputs, matched to the names declared in the outputs section --> | |
30 <output name="output1" ftype="csv" | |
31 file="dataset_summary.csv"/> | |
32 <output name="output2" ftype="csv" | |
33 file="accuracy_summary.csv"/> | |
34 </test> | |
35 </tests> | |
36 <help> | |
37 This tool builds 5 different models to predict, e.g., species from flower measurements. | |
38 The best-performing model can then be selected for further analysis. | |
39 | |
40 The following 5 algorithms are evaluated: | |
41 | |
42 - **Linear Discriminant Analysis (LDA)** | |
43 - **Classification and Regression Trees (CART)** | |
44 - **k-Nearest Neighbors (kNN)** | |
45 - **Support Vector Machines (SVM) with a linear kernel** | |
46 - **Random Forest (RF)** | |
47 | |
48 This is a good mixture of simple linear (LDA), nonlinear (CART, kNN) and complex nonlinear methods (SVM, RF). | |
49 We reset the random number seed before each run so that every algorithm is evaluated | |
50 on exactly the same data splits, which makes the results directly comparable. | |
51 | |
52 </help> | |
53 <citations> | |
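54 <citation type="bibtex">@Manual{caret, title = {caret: Classification and Regression Training}, author = {Max Kuhn}, url = {https://CRAN.R-project.org/package=caret}}</citation> | |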
55 </citations> | |
56 </tool> |
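
The help text in the tool states that the random number seed is reset before each run so that all five algorithms are evaluated on the same data splits. The sketch below illustrates that idea only. It is not the contents of `accuracy.R`, which is not shown here; the seed value, the number of folds, and the helper `make_folds` are all hypothetical.

```python
import random


def make_folds(n_rows, k, seed):
    """Assign each row index to one of k folds, reproducibly for a given seed."""
    rng = random.Random(seed)  # a fresh generator per call acts as "resetting the seed"
    indices = list(range(n_rows))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    return [sorted(indices[i::k]) for i in range(k)]


SEED = 7      # hypothetical seed value
N_ROWS = 150  # number of rows in the classic iris table
K = 10        # hypothetical number of folds
ALGORITHMS = ["lda", "cart", "knn", "svm", "rf"]

# Re-seed before each algorithm's run, so every run sees identical folds.
folds_per_algorithm = {name: make_folds(N_ROWS, K, SEED) for name in ALGORITHMS}

reference = folds_per_algorithm["lda"]
same_splits = all(folds == reference for folds in folds_per_algorithm.values())
print("identical splits for all algorithms:", same_splits)
```

Because every algorithm is scored on the same partitions, differences in accuracy can be attributed to the models rather than to a lucky or unlucky split.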