changeset 0:a0dd76408b0c

Uploaded
author xuebing
date Sat, 31 Mar 2012 14:00:28 -0400
parents
children 11e3609fa73c
files bed_shuffle_chrom.xml
diffstat 1 files changed, 46 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/bed_shuffle_chrom.xml	Sat Mar 31 14:00:28 2012 -0400
@@ -0,0 +1,46 @@
+<tool id="bed_shuffle_chrom" name="bed_shuffle_chrom">
+  <description>shuffle intervals, weighting chromosomes by length</description>
+  <command interpreter="python">bed_shuffle_chrom.py $input $output $within $genome </command>
+  <inputs>
+    <param name="input" format="interval" type="data" label="reference interval file to mimic"/>
+    <param name="within" label="randomize within chromosome" help="If checked, each original interval is moved to a random position on the SAME chromosome. By default, an interval can move to any chromosome (with probability proportional to chromosome size)." type="boolean" truevalue="within" falsevalue="across" checked="False"/>
+
+            <param name="genome" type="select" label="Select chromosome size file" >
+                <options from_file="chrsize.loc">
+                    <column name="name" index="0"/>
+                    <column name="value" index="1"/>
+                </options>
+            </param>
+  </inputs>
+  <outputs>
+    <data format="interval" name="output" />
+  </outputs>
+  <help>
+
+
+**What it does**
+
+This tool generates a set of intervals randomly distributed across the genome, mimicking the size distribution of the reference set. The output contains the same number of intervals as the reference set.
+
+
+**How it works**
+
+For each interval in the reference set, the script picks a random position in the genome as the new start, and then picks the end so that the random interval has the same size as the original one. By default, the interval can move to any chromosome, with probability proportional to the length of the chromosome. Alternatively, you can have it pick a random position on the same chromosome, so that each chromosome in the randomized set has the same number of intervals as in the reference set. The size of each chromosome can either be learned from the reference set (chromosome size = max(interval end)) or read from a chromosome size file. When sizes are learned from the reference set, only regions spanned by reference intervals are used to generate random intervals. Regions (possibly an entire chromosome) not covered by the reference set will not appear in the output.
+
+**Chromosome size file**
+
+Chromosome size files for hg18, hg19, mm8, and mm9 can be found in 'Shared Data'. To use one of those files, select it and import it into your history; the file will then be listed in the drop-down menu of this tool. You can also make your own chromosome size file, in which each line specifies the size of one chromosome (tab-delimited):
+
+chr1 92394392
+
+chr2 232342342    
+
+
+You can use the following script from the UCSC Genome Browser to download chromosome size files for other genomes:
+  
+http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/fetchChromSizes
+
+
+  </help>
+  
+</tool>
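
The changeset adds only the tool wrapper, not bed_shuffle_chrom.py itself, so the following is a minimal sketch of the procedure described under "How it works", not the author's script. The function names (read_chrom_sizes, shuffle_intervals), the tuple representation of intervals, and the way an interval longer than its chromosome is handled are all assumptions made for illustration.

```python
import random


def read_chrom_sizes(lines):
    """Parse 'chrom<TAB>size' lines into a dict (hypothetical helper)."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def shuffle_intervals(intervals, sizes, within=False, rng=random):
    """Return one random interval per input interval, keeping its length.

    intervals: iterable of (chrom, start, end) tuples.
    sizes:     dict mapping chromosome name to length.
    within:    if True, stay on the original chromosome; otherwise pick a
               chromosome with probability proportional to its length.
    """
    chroms = sorted(sizes)
    weights = [sizes[c] for c in chroms]
    shuffled = []
    for chrom, start, end in intervals:
        length = end - start
        if within:
            target = chrom
        else:
            target = rng.choices(chroms, weights=weights, k=1)[0]
        # Assumption: the new start is limited so the interval fits on the
        # chromosome; an interval longer than the chromosome starts at 0.
        max_start = max(sizes[target] - length, 0)
        new_start = rng.randint(0, max_start)
        shuffled.append((target, new_start, new_start + length))
    return shuffled
```

For example, shuffle_intervals([("chr1", 10, 60)], read_chrom_sizes(["chr1\t1000"]), within=True) returns a single 50-base interval somewhere on chr1. Passing a seeded random.Random instance as rng makes the shuffle reproducible.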