view GALAXY_FILES/tools/EMBER/Integrate_Data.xml @ 4:e960969a92ae default tip

Uploaded
author mmaiensc
date Thu, 22 Mar 2012 14:07:11 -0400
parents 037c3edda16e

<tool id="integrate_data" name="Integrate Data" version="1.3.1">
  <description>Step 2 of analysis: assigns potential targets to binding sites</description>
  <command interpreter="perl">Integrate_Data.pl -b $binding_data -e $expression_data -o $output -d $dist -dt $dtype -v n</command>
  <inputs>
    <param format="txt" name="binding_data" type="data" label="Binding data"/>
    <param format="txt" name="expression_data" type="data" label="Discretized expression data"/>
    <param name="dist" type="integer" min="0" label="Max distance (kbp)" value="100"/>
    <param name="dtype" type="select" label="Distance type">
      <option value="1" selected="true">To gene boundaries</option>
      <option value="2">To TSS</option>
    </param>
  </inputs>
  <outputs>
    <data format="txt" name="output"/>
  </outputs>

  <tests>
    <test>
      <param name="binding_data" value="EMBER/peaks.txt"/>
      <param name="expression_data" value="EMBER/expression_profiles.txt"/>
      <param name="dist" value="100"/>
      <param name="dtype" value="1"/>
      <output name="output" file="EMBER/integrated.txt"/>
    </test>
  </tests>

  <help>

This tool combines binding data with annotated gene/probe sets to assign potential targets to each binding site.

-----

Description of inputs:

*Binding Data*:

   Binding data in BED-like format. Only the first coordinate after [chr] is used, so if your file is in standard BED format, you may want to insert a new second column containing the average of the start and end coordinates. [other information] is retained throughout the analysis and may contain the peak ID, peak enrichment, etc.

   *Format (at least 2 columns)*: [chr] [peak posn] [other information]

*Discretized Expression Data*: output of PreProcess Expression Data.

*Max Distance*: maximum distance (in kbp) between a peak and a gene for the gene to be considered a potential target.

*Distance Type*: how the distance between a peak and a gene is measured: to the gene boundaries or to the transcription start site (TSS). If "To gene boundaries" is chosen, peaks lying within a gene's coordinates have a distance of 0.

  </help>

</tool>
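
The help text above suggests adding a midpoint column to standard BED files and describes two ways of measuring peak-to-gene distance. The following is a minimal Python sketch of both ideas. It is not taken from Integrate_Data.pl. All function names are hypothetical, and several details are assumptions: tab-delimited input, a midpoint rounded down, a TSS taken as the gene start on the "+" strand and the gene end otherwise, and a distance cutoff that is inclusive.

```python
def bed_to_peak_line(bed_line):
    """Insert the start/end midpoint as a new second column.

    Assumes a tab-delimited line of the form: chr, start, end, [other fields].
    The original start, end and remaining fields are kept as trailing columns.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    midpoint = (start + end) // 2  # assumption: average rounded down
    return "\t".join([chrom, str(midpoint)] + fields[1:])


def distance_to_gene_boundaries(peak_pos, gene_start, gene_end):
    """Distance to the gene body; 0 when the peak lies within the gene."""
    return max(gene_start - peak_pos, peak_pos - gene_end, 0)


def distance_to_tss(peak_pos, gene_start, gene_end, strand):
    """Distance to the TSS (assumed: gene start on '+', gene end otherwise)."""
    tss = gene_start if strand == "+" else gene_end
    return abs(peak_pos - tss)


def is_potential_target(distance_bp, max_dist_kbp=100):
    """Apply the max-distance cutoff given in kbp (assumed inclusive)."""
    return max_dist_kbp * 1000 >= distance_bp


if __name__ == "__main__":
    print(bed_to_peak_line("chr1\t100\t200\tpeak1"))
    print(distance_to_gene_boundaries(150, 100, 200))
    print(distance_to_tss(120, 100, 200, "-"))
```

With these definitions, a peak inside a gene is always within range under the gene-boundaries distance, while under the TSS distance it can still fall outside the cutoff for a long gene.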