view tools/new_operations/coverage.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500

<tool id="gops_coverage_1" name="Coverage">
  <description>of a set of intervals on a second set of intervals</description>
  <command interpreter="python">gops_coverage.py $input1 $input2 $output -1 ${input1.metadata.chromCol},${input1.metadata.startCol},${input1.metadata.endCol},${input1.metadata.strandCol} -2 ${input2.metadata.chromCol},${input2.metadata.startCol},${input2.metadata.endCol},${input2.metadata.strandCol}</command>
  <inputs>
    <param format="interval" name="input1" type="data" help="First dataset">
      <label>What portion of</label>
    </param>
    <param format="interval" name="input2" type="data" help="Second dataset">
      <label>is covered by</label>
    </param>
  </inputs>
  <outputs>
    <data format="interval" name="output" metadata_source="input1" />
  </outputs>
  <code file="operation_filter.py"/>
  <tests>
    <test>
      <param name="input1" value="1.bed" />
      <param name="input2" value="2.bed" />
      <output name="output" file="gops_coverage_out.interval" />
    </test>
    <test>
      <param name="input1" value="1.bed" />
      <param name="input2" value="2_mod.bed" ftype="interval"/>
      <output name="output" file="gops_coverage_out_diffCols.interval" />
    </test>
    <test>
      <param name="input1" value="gops_bigint.interval" />
      <param name="input2" value="gops_bigint2.interval" />
      <output name="output" file="gops_coverage_out2.interval" />
    </test>
  </tests>
  <help>

.. class:: infomark

**TIP:** If your dataset does not appear in the pulldown menu, it is not in interval format. Use "edit attributes" to set the chromosome, start, end, and strand columns.

For each interval in the first dataset, find how much of it is covered by intervals in the second dataset. The coverage is appended to each interval of the first dataset as two columns: the first is the number of bases covered, and the second is the fraction of that interval's bases that are covered.

-----

**Screencasts!**

See Galaxy Interval Operation Screencasts_ (right click to open this link in another window).

.. _Screencasts: http://wiki.g2.bx.psu.edu/Learn/Interval%20Operations

-----

**Example**


    if **First dataset** are genes::

      chr11 5203271 5204877 NM_000518 0 -
      chr11 5210634 5212434 NM_000519 0 -
      chr11 5226077 5227663 NM_000559 0 -
      chr11 5226079 5232587 BC020719  0 -
      chr11 5230996 5232587 NM_000184 0 -

    and **Second dataset** are repeats::

       chr11      5203895 5203991 L1MA6     500 +
       chr11      5204163 5204239 A-rich    219 +
       chr11      5211034 5211167 (CATATA)n 245 +
       chr11      5211642 5211673 AT_rich    24 +
       chr11      5226551 5226606 (CA)n     303 +
       chr11      5228782 5228825 (TTTTTG)n 208 +
       chr11      5229045 5229121 L1PA11    440 +
       chr11      5229133 5229319 MER41A   1106 +
       chr11      5229374 5229485 L2        244 -
       chr11      5229751 5230083 MLT1A     913 -
       chr11      5231469 5231526 (CA)n     330 +

    the Result is the coverage density of repeats in the genes::

       chr11 5203271 5204877 NM_000518 0 - 172   0.107098
       chr11 5210634 5212434 NM_000519 0 - 164   0.091111
       chr11 5226077 5227663 NM_000559 0 -  55   0.034678
       chr11 5226079 5232587 BC020719  0 - 860   0.132145
       chr11 5230996 5232587 NM_000184 0 -  57   0.035827

    For example, the following line of output::

      chr11 5203271 5204877 NM_000518 0 - 172   0.107098

    implies that 172 nucleotides, accounting for 10.7% of this interval (chr11:5203271-5204877), overlap with repetitive elements.

</help>
</tool>
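
The arithmetic behind the two added columns in the help example can be sketched in a few lines of Python. This is only an illustration, not the code in gops_coverage.py: the function names `merge_intervals` and `coverage` are hypothetical, all intervals are assumed to lie on the same chromosome, and coordinates are assumed to be half-open (length = end - start), which is consistent with the numbers shown in the example output.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) pairs so shared bases are counted once."""
    merged = []
    for start, end in sorted(intervals):
        if merged and merged[-1][1] >= start:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def coverage(interval, others):
    """Return (bases covered, fraction covered) for one half-open interval."""
    start, end = interval
    covered = 0
    for o_start, o_end in merge_intervals(others):
        overlap = min(end, o_end) - max(start, o_start)
        if overlap > 0:
            covered += overlap
    return covered, covered / float(end - start)


# First gene (NM_000518) and the first three repeats from the help example.
gene = (5203271, 5204877)
repeats = [(5203895, 5203991), (5204163, 5204239), (5211034, 5211167)]
bases, fraction = coverage(gene, repeats)
print(bases, round(fraction, 6))  # 172 0.107098
```

Merging the second dataset first means a base covered by two overlapping repeats is counted only once; whether the actual tool handles overlaps the same way is an assumption of this sketch.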