annotate best_regression_subsets.xml @ 2:4f33ec73e445 default tip
Corrected version string.
author | devteam <devteam@galaxyproject.org> |
---|---|
date | Thu, 10 Apr 2014 13:47:30 -0400 |
parents | e769cde223a5 |
children |
rev | line source |
---|---|
2 | 1 <tool id="BestSubsetsRegression1" name="Perform Best-subsets Regression" version="1.0.0"> |
0 | 2 <description> </description> |
1 | 3 <requirements> |
4 <requirement type="package" version="1.7.1">numpy</requirement> | |
5 <requirement type="package" version="1.0.3">rpy</requirement> | |
6 </requirements> | |
0 | 7 <command interpreter="python"> |
8 best_regression_subsets.py | |
9 $input1 | |
10 $response_col | |
11 $predictor_cols | |
12 $out_file1 | |
13 $out_file2 | |
14 1>/dev/null | |
15 2>/dev/null | |
16 </command> | |
17 <inputs> | |
18 <param format="tabular" name="input1" type="data" label="Select data" help="Dataset missing? See TIP below."/> | |
19 <param name="response_col" label="Response column (Y)" type="data_column" data_ref="input1" /> | |
20 <param name="predictor_cols" label="Predictor columns (X)" type="data_column" data_ref="input1" multiple="true" > | |
21 <validator type="no_options" message="Please select at least one column."/> | |
22 </param> | |
23 </inputs> | |
24 <outputs> | |
25 <data format="input" name="out_file1" metadata_source="input1" /> | |
26 <data format="pdf" name="out_file2" /> | |
27 </outputs> | |
28 <tests> | |
29 <!-- This tool cannot be tested automatically because it produces a PDF output file. | |
30 --> | |
31 </tests> | |
32 <help> | |
33 | |
34 .. class:: infomark | |
35 | |
36 **TIP:** If your data is not TAB delimited, use *Edit Datasets->Convert characters* | |
37 | |
38 ----- | |
39 | |
40 .. class:: infomark | |
41 | |
42 **What it does** | |
43 | |
44 This tool uses the 'regsubsets' function from the R 'leaps' package to perform best-subsets regression. It outputs two files: a table of the best subsets with their summary statistics, and a PDF containing a graphical representation of the results. | |
45 | |
46 ----- | |
47 | |
48 .. class:: warningmark | |
49 | |
50 **Note** | |
51 | |
52 - This tool currently treats all predictor and response variables as continuous variables. | |
53 | |
54 - Rows containing non-numeric (or missing) data in any of the chosen columns are excluded from the analysis. | |
55 | |
56 - The 6 columns in the output are described below: | |
57 | |
58 - Column 1 (Vars): denotes the number of variables in the model | |
59 - Column 2 ([c2 c3 c4...]): lists the user-selected predictor variables (the full model). An asterisk marks each predictor variable that is present in the selected model. | |
60 - Column 3 (R-sq): the fraction of variance explained by the model | |
61 - Column 4 (Adj. R-sq): the R-squared statistic above, adjusted to penalize a larger number of predictors (p) | |
62 - Column 5 (Cp): Mallows' Cp statistic | |
63 - Column 6 (bic): Bayesian Information Criterion. | |
64 | |
65 | |
66 </help> | |
67 </tool> |
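
The help text above says that rows with non-numeric or missing values in the chosen columns are left out, and that Adj. R-sq penalizes R-sq for the number of predictors (p). The sketch below illustrates both ideas in plain Python. It is not the code of best_regression_subsets.py, which is not shown on this page; the function names `parse_numeric_rows` and `adjusted_r_squared` are hypothetical, and the adjustment uses the usual textbook formula, assumed here to match what the tool reports.

```python
import math


def parse_numeric_rows(lines, columns):
    """Keep rows whose selected columns all parse as finite numbers.

    `lines` are TAB-delimited strings; `columns` are 1-based column indices.
    Hypothetical helper: the real tool's filtering logic may differ.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        try:
            values = [float(fields[c - 1]) for c in columns]
        except (ValueError, IndexError):
            # Non-numeric, empty, or missing field: skip the whole row.
            continue
        if all(math.isfinite(v) for v in values):
            kept.append(values)
    return kept


def adjusted_r_squared(r_squared, n_rows, n_predictors):
    """Textbook adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_rows - 1) / (n_rows - n_predictors - 1)


if __name__ == "__main__":
    rows = ["1.0\t2.0\t3.0", "4.0\tNA\t6.0", "7.0\t\t9.0", "10\t11\t12"]
    print(parse_numeric_rows(rows, [1, 2]))
    print(adjusted_r_squared(0.9, 11, 2))
```

With the same R-sq, a model with more predictors gets a lower Adj. R-sq, which is why that column is the more useful one for comparing subsets of different sizes.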