kmersvm: kmersvm/train.xml comparison

comparison kmersvm/train.xml @ 7:fd740d515502 draft default tip

Uploaded revised kmer-SVM to include modules from kmer-visual.

author	cafletezbrant
date	Sun, 16 Jun 2013 18:06:14 -0400
parents	7fe1103032f7
children


-:1aea7c1a9ab1
+:fd740d515502
 		</param>
 		<when value="custom">
 		<param name="weight" type="float" value="1" label="Input The Value of Positive Set Weight" />
 		</when>
 </conditional>
-<param name="SVMC" type="integer" value="1" label="Regularization Param C" />
-<param name="EPS" type="float" value="0.00001" label="Precision Param E" />
+<param name="SVMC" type="float" value="1" label="Regularization Param C" >
+	<validator type="in_range" message="SVMC must be in range 0.01 to 1" min="0.01" max="1" />
+</param>
+<param name="EPS" type="float" value="0.00001" label="Precision Param E" >
+	<validator type="in_range" message="EPS must be in range 1e-5 to 1e-1" min="0.00001" max="0.1" />
+</param>
 </inputs>
 <outputs>
 <data format="tabular" name="SVM_weights" from_work_dir="kmersvm_output_weights.out" label="${tool.name} on ${on_string} : Weights" />
 <data format="tabular" name="CV_predictions" from_work_dir="kmersvm_output_cvpred.out" label="${tool.name} on ${on_string} : Predictions" />
 </outputs>
 **What it does**
 Takes as input 2 FASTA files, 1 of positive sequences and 1 of negative sequences.  Produces 2 outputs:
-A) Weights: list of sequences of length K ranked by score and posterior probability for that score.
+A) Weights: list of sequences of length K ranked by score.
-B) Predictions: results of N-fold cross validation
+B) Predictions: results of N-fold cross validation.
+----
+**Recommended Settings**
+Kernel: Spectrum
+Kmer length: 6
+N-Fold Cross-Validation: 5
+Weight: We recommend letting the Positive Set Weight be selected automatically, unless it has been separately optimized.
+Regularization Parameter C: We recommend values between 0.1 and 1.
+Precision Parameter E: We recommend using the default and staying below 0.1.
 ----
 **Parameters**
 Kernel: 2 choices:
 A) Spectrum Kernel: Analyzes a sequence using strings of length K.
-B) Weighted Spectrum Kernel: Analyzes a sequence using strings of range of lengths K1 - Kn.
+B) Weighted Spectrum Kernel: Analyzes a sequence using strings over a range of lengths, K_min to K_max.
 N-Fold Cross Validation: Number of partitions of training data used for cross validation.
 Weight: Increases importance of positive data (increase if positive sets are very trustworthy or for training with very large negative sequence sets).
 Regularization Parameter: Penalty for misclassification.  Trade-off is overfitting (high parameter) versus high error rate (low parameter).
 Precision Parameter:  Insensitivity zone.  Affects precision of SVM by altering number of support vectors used.
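Reviewer note: the two kernel choices described above can be illustrated with a minimal Python sketch. It is not taken from the kmersvm source; the function names (`kmer_counts`, `spectrum_kernel`, `weighted_spectrum_kernel`) are hypothetical, the weighted variant assumes equal weight for every length from K_min to K_max, and reverse complements are not handled. The actual tool may differ on all of these points.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    # range() is empty when the sequence is shorter than k, giving an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(seq_a, seq_b, k):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for k-mers it has not seen, so only shared k-mers contribute.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


def weighted_spectrum_kernel(seq_a, seq_b, k_min, k_max):
    """Sum of spectrum kernels over lengths k_min..k_max (equal weights assumed)."""
    return sum(spectrum_kernel(seq_a, seq_b, k) for k in range(k_min, k_max + 1))
```

For example, `spectrum_kernel("ACGT", "ACGA", 2)` counts the 2-mers shared by the two sequences (`AC` and `CG`), so sequences with more k-mers in common receive a larger kernel value.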
 ----
 **Example**
 Weights file::
