view kmersvm/tomtom.xml @ 7:fd740d515502 draft default tip

Uploaded revised kmer-SVM to include modules from kmer-visual.
author cafletezbrant
date Sun, 16 Jun 2013 18:06:14 -0400

<tool id="tomtom" name="Tomtom" version="1.0.0">

	<description>Tomtom tool for motif searching</description>
	<command>/home/galaxy/meme/bin/tomtom -no-ssc -internal -text -verbosity 1 -thresh $thresh 
		#if str($cut.cut_choice) == 'e.value':
			-evalue
		#end if

		#if str($dist.dist) == 'ed':
			-dist ed
		#elif str($dist.dist) == 'sw':
			-dist sandelin
		#else
			-dist pearson	
		#end if
	
	 $input1 /home/galaxy/meme/db/combined_db.meme > tomtom_out.txt
	 
	 </command>
	 <inputs>
	 	<param format="txt" name="input1" type="data" label="PWM File"/>
		<param type="float" value="0.5" label="Threshold" name="thresh"/>
	 	<conditional name="cut">
	 		<param name="cut_choice" type="select" label="Threshold Type">
	 			<option value="q.value" selected="true">q-value</option>
	 			<option value="e.value">E-value</option>
	 		</param>	
	 	</conditional>
	 	
	 	<conditional name="dist">
	 		<param name="dist" type="select" label="Distance Metric">
	 			<option value="pearson" selected="true">Pearson</option>
	 			<option value="ed">Euclidean</option>
	 			<option value="sw">Sandelin-Wasserman Function</option>
	 		</param>
	 	</conditional>
	 </inputs>
	 
	 <outputs>
	 	<data format="txt" name="Tomtom Results" from_work_dir="tomtom_out.txt" label="${tool.name} on ${on_string}: Tomtom Matches"/>

	 </outputs>
	<help>

Tomtom compares a DNA motif against a database of known motifs. For an in-depth explanation of the Tomtom software, see here_.

----

**Recommended Settings**

We recommend that most users keep the Tomtom defaults: q-value as the threshold type, a threshold of 0.5, and the Pearson correlation coefficient as the distance metric.

----

**Parameters**

Users can choose which distance metric is used to find matching motifs: the Pearson correlation coefficient, the Euclidean distance, or the Sandelin-Wasserman function.

  * The Pearson correlation coefficient measures how strongly corresponding columns of two position weight matrices (PWMs) are correlated.

  * The Euclidean distance can be thought of as the length of the straight line between corresponding columns of two PWMs.

  * The Sandelin-Wasserman function is based on the sum of squared differences between corresponding columns of two PWMs.
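
As a rough illustration of the three metrics, the sketch below scores a single pair of PWM columns. It is not Tomtom's own code: the function names and the example columns are hypothetical, and the Sandelin-Wasserman form used here (2 minus the sum of squared differences) is the commonly quoted column score, assumed rather than taken from this tool.

```python
import math

def pearson(x, y):
    """Pearson correlation of two PWM columns (higher means more similar).

    Undefined (division by zero) if either column is flat, e.g. all 0.25.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def euclidean(x, y):
    """Straight-line distance between two columns (lower means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def sandelin_wasserman(x, y):
    """Assumed form: 2 minus the sum of squared differences (higher is more similar)."""
    return 2.0 - sum((a - b) ** 2 for a, b in zip(x, y))

# Hypothetical columns in A, C, G, T order: one favours A, the other favours T.
col_a = [0.7, 0.1, 0.1, 0.1]
col_b = [0.1, 0.1, 0.1, 0.7]

print(pearson(col_a, col_b), euclidean(col_a, col_b), sandelin_wasserman(col_a, col_b))
```

For these two columns the values are approximately -0.33, 0.85 and 1.28. Pearson and Sandelin-Wasserman are similarities, while the Euclidean value is a distance, so the direction of "better" differs between them.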

We also offer the choice of E-value and q-value to threshold the results returned by Tomtom.

  * The E-value controls the expected number of false positives and can be any non-negative number.

  * The q-value controls the false discovery rate and is a number between 0 and 1.
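
The sketch below shows how the form settings translate into Tomtom command-line options, mirroring the command template of this wrapper. The function name ``build_tomtom_args`` is hypothetical and exists only for illustration; the input PWM file and the motif database path are left out.

```python
def build_tomtom_args(thresh=0.5, cut_choice="q.value", dist="pearson"):
    """Return the Tomtom options this wrapper would pass for the given settings."""
    args = ["tomtom", "-no-ssc", "-internal", "-text", "-verbosity", "1",
            "-thresh", str(thresh)]
    # The threshold is a q-value unless E-value is selected.
    if cut_choice == "e.value":
        args.append("-evalue")
    # Map the form values to Tomtom's -dist names; anything else means Pearson.
    dist_flag = {"ed": "ed", "sw": "sandelin"}.get(dist, "pearson")
    args += ["-dist", dist_flag]
    return args

print(" ".join(build_tomtom_args()))
print(" ".join(build_tomtom_args(thresh=10, cut_choice="e.value", dist="sw")))
```

With the recommended settings only ``-thresh 0.5 -dist pearson`` is added to the fixed options; choosing E-value adds ``-evalue`` so that the same threshold field is read as an E-value.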

----

Note that at this time we only offer Tomtom output in txt format.

----

**Citation**

If you use this tool, please cite: Shobhit Gupta, JA Stamatoyannopoulos, Timothy Bailey and William Stafford Noble, "Quantifying similarity between motifs", Genome Biology, 8(2):R24, 2007.

.. _here: http://meme.nbcr.net/meme/tomtom-intro.html

  </help>
</tool>