annotate bed_shuffle_chrom.xml @ 1:11e3609fa73c

Uploaded
author xuebing
date Sat, 31 Mar 2012 14:00:39 -0400
parents a0dd76408b0c
<tool id="bed_shuffle_chrom" name="bed_shuffle_chrom">
  <description>shuffle intervals, weighting chromosomes by length</description>
  <command interpreter="python">bed_shuffle_chrom.py $input $output $within $genome </command>
  <inputs>
    <param name="input" format="interval" type="data" label="reference interval file to mimic"/>
    <param name="within" label="randomize within chromosome" help="If checked, each original interval is moved to a random position on the SAME chromosome. By default, it can be moved to any chromosome (with probability proportional to chromosome size)." type="boolean" truevalue="within" falsevalue="across" checked="False"/>

    <param name="genome" type="select" label="Select chromosome size file">
      <options from_file="chrsize.loc">
        <column name="name" index="0"/>
        <column name="value" index="1"/>
      </options>
    </param>
  </inputs>
  <outputs>
    <data format="interval" name="output" />
  </outputs>
  <help>


**What it does**

This tool generates a set of intervals randomly distributed in the genome, mimicking the size distribution of the reference set. The output contains the same number of intervals as the reference set.


**How it works**

For each interval in the reference set, the script picks a random position in the genome as the new start, and then picks the end such that the random interval has the same size as the original one. By default, the interval can be moved to any chromosome, with probability proportional to the length of the chromosome. Alternatively, you can have it pick a random position on the same chromosome, so that in the randomized set each chromosome has the same number of intervals as in the reference set. The size of each chromosome can be either learned from the reference set (chromosome size = max(interval end)) or read from a chromosome size file. When learning from the reference set, only regions spanned by reference intervals are used to generate random intervals. Regions (possibly an entire chromosome) not covered by the reference set will not appear in the output.
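
The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in bed_shuffle_chrom.py: the function name `shuffle_intervals`, the `(chrom, start, end)` tuple layout, and the handling of intervals longer than their chromosome are all hypothetical choices made for the example.

```python
import bisect
import random
from itertools import accumulate


def shuffle_intervals(intervals, chrom_sizes, within=False, rng=None):
    """Return one random interval per reference interval, keeping each length.

    intervals:   list of (chrom, start, end) tuples (hypothetical layout)
    chrom_sizes: dict mapping chromosome name to its length
    within:      if True, keep every interval on its original chromosome
    """
    rng = rng or random.Random()
    names = sorted(chrom_sizes)
    # Running totals of chromosome lengths, used for length-weighted choice.
    totals = list(accumulate(chrom_sizes[name] for name in names))
    shuffled = []
    for chrom, start, end in intervals:
        length = end - start
        if not within:
            # Pick a chromosome with probability proportional to its length.
            point = rng.randrange(totals[-1])
            chrom = names[bisect.bisect_right(totals, point)]
        # Assumption: an interval longer than its chromosome starts at 0
        # (it would then overrun the chromosome end).
        last_start = max(chrom_sizes[chrom] - length, 0)
        new_start = rng.randint(0, last_start)
        shuffled.append((chrom, new_start, new_start + length))
    return shuffled


# Example call with made-up sizes and intervals.
sizes = {"chr1": 1000, "chr2": 500}
reference = [("chr1", 100, 150), ("chr2", 10, 40)]
print(shuffle_intervals(reference, sizes, within=True, rng=random.Random(0)))
```

With `within=True` each output interval stays on its original chromosome; with `within=False` longer chromosomes receive proportionally more intervals on average.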

**Chromosome size file**

Chromosome size files for hg18, hg19, mm8, and mm9 can be found in 'Shared Data'. To use one of those files, select the correct one and import it into the history; the file will then be listed in the drop-down menu of this tool. You can also make your own chromosome size file, in which each line specifies the size of one chromosome (tab-delimited):

chr1	92394392

chr2	232342342
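
To check a hand-made chromosome size file before using it, a small parser along the following lines may help. It is a sketch only: the helper name `read_chrom_sizes` is hypothetical, and the choice to skip blank or malformed lines is an assumption rather than the behavior of this tool.

```python
import io


def read_chrom_sizes(handle):
    """Parse tab-delimited 'name, size' lines into a dict of name to int size."""
    sizes = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # Skip blank or malformed lines rather than failing (an assumption).
        if len(fields) != 2 or not fields[1].strip().isdigit():
            continue
        sizes[fields[0]] = int(fields[1])
    return sizes


# The two example lines shown above, held in memory instead of a file.
example = io.StringIO("chr1\t92394392\nchr2\t232342342\n")
print(read_chrom_sizes(example))
```

For a file on disk, pass an open file object instead of the in-memory `io.StringIO` used here.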


You can use the following script from the UCSC Genome Browser to download chromosome size files for other genomes:

http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/fetchChromSizes


  </help>

</tool>