view CADDSuite-1.5/galaxyconfigs/tools/InputPartitioner.xml @ 7:bfab27640f5e draft

CADDSuite version 1.5
author Marcel Schumann <schumann.marcel@gmail.com>
date Tue, 24 Jul 2012 11:13:59 -0400

<!--This is a configuration file for the integration of a CADDSuite tool into Galaxy (http://usegalaxy.org). It was generated automatically by GalaxyConfigGenerator, so avoid extensive manual modifications.-->
<tool id="inputpartitioner" name="InputPartitioner" version="1.5">
    <description>split QSAR data set</description>
    <command interpreter="bash"><![CDATA[../../InputPartitioner 
#if str( $i ) != ''  and str( $i ) != 'None' :
   -i "$i"
#end if
#if str( $o ) != ''  and str( $o ) != 'None' :
   -o "$o"
#end if
#if str( $n ) != ''  and str( $n ) != 'None' :
   -n "$n"
#end if
 | tail -n 5
]]></command>
    <inputs>
        <param name="i" optional="false" label="input data-file" type="data" format="dat"/>
        <param name="n" optional="false" label="number of partitions" type="text" area="true" size="1x5" value=""/>
    </inputs>
    <outputs>
        <data name="o" format="dat"/>
    </outputs>
    <help>InputPartitioner splits a given QSAR data set into n partitions with evenly distributed response values.
This makes the tool useful as part of a nested validation pipeline.
Input is a data file as generated by InputReader.
Output is written as n pairs of files whose names end in '_TRAIN&lt;i&gt;.dat' and '_TEST&lt;i&gt;.dat', where &lt;i&gt; is the ID of the respective partition. For each partition, the training set contains only those compounds that were not selected for the corresponding test set.</help>
</tool>
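<!--
The help text above describes the partitioning only in words. The Python sketch
below illustrates one plausible way to obtain n train/test splits with evenly
distributed response values: rank the compounds by response and deal them out
round-robin. This is an assumption made for illustration, not the algorithm
CADDSuite actually implements; the function name partition_by_response and the
sample activities are hypothetical, and the sketch works on list indices rather
than on the .dat files written by InputReader.

```python
from typing import List, Sequence, Tuple


def partition_by_response(
    responses: Sequence[float], n: int
) -> List[Tuple[List[int], List[int]]]:
    """Return n (train_indices, test_indices) pairs (illustrative sketch only)."""
    if n < 2 or n > len(responses):
        raise ValueError("n must be between 2 and the number of compounds")

    # Rank compounds by response value, then assign every n-th ranked compound
    # to the same test set so each test set spans the whole response range.
    ranked = sorted(range(len(responses)), key=lambda idx: responses[idx])
    test_sets = [ranked[fold::n] for fold in range(n)]

    splits = []
    for test in test_sets:
        held_out = set(test)
        # The training set holds exactly the compounds not in this test set.
        train = [idx for idx in range(len(responses)) if idx not in held_out]
        splits.append((train, sorted(test)))
    return splits


if __name__ == "__main__":
    activities = [0.1, 2.5, 1.3, 0.7, 3.9, 2.0]  # hypothetical response values
    for i, (train, test) in enumerate(partition_by_response(activities, 3)):
        print(f"partition {i}: train={train} test={test}")
```

With the six sample values and n = 3, each test set pairs one low and one high
response, and every compound appears in exactly one test set. Whether the real
tool numbers its partitions from 0 or from 1 is not stated in the help text.
-->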