annotate tools/filters/ucsc_gene_bed_to_exon_bed.xml @ 1:cdcb0ce84a1b

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:45:15 -0500
parents 9071e359b9a3
children
rev   line source
0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1 <tool id="gene2exon1" name="Gene BED To Exon/Intron/Codon BED">
2 <description>expander</description>
3 <command interpreter="python">ucsc_gene_bed_to_exon_bed.py --input=$input1 --output=$out_file1 --region=$region "--exons"</command>
4 <inputs>
5 <param name="region" type="select">
6 <label>Extract</label>
7 <option value="transcribed">Coding Exons + UTR Exons</option>
8 <option value="coding">Coding Exons only</option>
9 <option value="utr5">5'-UTR Exons</option>
10 <option value="utr3">3'-UTR Exons</option>
11 <option value="intron">Introns</option>
12 <option value="codon">Codons</option>
13 </param>
14 <param name="input1" type="data" format="bed" label="from" help="this history item must contain a 12 field BED (see below)"/>
15 </inputs>
16 <outputs>
17 <data name="out_file1" format="bed"/>
18 </outputs>
19 <tests>
20 <test>
21 <param name="input1" value="3.bed" />
22 <param name="region" value="transcribed" />
23 <output name="out_file1" file="cf-gene2exon.dat"/>
24 </test>
25 </tests>
26 <help>
27
28 .. class:: warningmark
29
30 This tool works only on a BED file that contains at least 12 fields (see **Example** and **About formats** below). The output will be empty if applied to a BED file with 3 or 6 fields.
31
32 ------
33
34 **What it does**
35
36 A single line of BED format can represent an entire gene, including its exons, the coding sequence (CDS) location, and the positions of untranslated regions (UTRs). This tool *unpacks* that information by converting each line describing a gene into a collection of lines representing individual exons, introns, UTRs, or codons, depending on the region selected.
37
38 -------
39
40 **Example**
41
42 Extracting **Coding Exons + UTR Exons** from the following two BED lines::
43
44 chr7 127475281 127491632 NM_000230 0 + 127486022 127488767 0 3 29,172,3225, 0,10713,13126
45 chr7 127486011 127488900 D49487 0 + 127486022 127488767 0 2 155,490, 0,2399
46
47 will return::
48
49 chr7 127475281 127475310 NM_000230 0 +
50 chr7 127485994 127486166 NM_000230 0 +
51 chr7 127488407 127491632 NM_000230 0 +
52 chr7 127486011 127486166 D49487 0 +
53 chr7 127488410 127488900 D49487 0 +
54
55 ------
56
57 .. class:: infomark
58
59 **About formats**
60
61 **BED format** Browser Extensible Data format was designed at UCSC for displaying data tracks in the Genome Browser. It has three required fields and additional optional ones. In the specific case of this tool the following fields must be present::
62
63 1. chrom - The name of the chromosome (e.g. chr1, chrY_random).
64 2. chromStart - The starting position in the chromosome. (The first base in a chromosome is numbered 0.)
65 3. chromEnd - The ending position in the chromosome, plus 1 (i.e., a half-open interval).
66 4. name - The name of the BED line.
67 5. score - A score between 0 and 1000.
68 6. strand - Defines the strand - either '+' or '-'.
69 7. thickStart - The starting position at which the feature is drawn thickly in the Genome Browser.
70 8. thickEnd - The ending position at which the feature is drawn thickly in the Genome Browser.
71 9. reserved - This should always be set to zero.
72 10. blockCount - The number of blocks (exons) in the BED line.
73 11. blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
74 12. blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
75
76
77 </help>
78 </tool>
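
The "Coding Exons + UTR Exons" expansion described in the help text can be sketched in a few lines of Python. This is only an illustration of the arithmetic (block start = chromStart + blockStart, block end = block start + blockSize), not the actual `ucsc_gene_bed_to_exon_bed.py` script; the function name `bed12_to_exons` is hypothetical, and the sketch assumes whitespace-separated input fields.

```python
def bed12_to_exons(line):
    """Expand one 12-field BED line into 6-field BED rows, one per block (exon).

    Hypothetical helper for illustration; not part of the Galaxy tool.
    """
    fields = line.split()
    if len(fields) < 12:
        # Mirrors the help text: BED lines with fewer than 12 fields yield nothing.
        return []
    chrom, name, score, strand = fields[0], fields[3], fields[4], fields[5]
    chrom_start = int(fields[1])
    block_count = int(fields[9])
    # blockSizes and blockStarts may carry a trailing comma, so skip empty items.
    sizes = [int(s) for s in fields[10].split(",") if s]
    rel_starts = [int(s) for s in fields[11].split(",") if s]
    rows = []
    for size, rel_start in list(zip(sizes, rel_starts))[:block_count]:
        start = chrom_start + rel_start  # blockStarts are relative to chromStart
        end = start + size               # half-open interval, as in BED
        rows.append("\t".join([chrom, str(start), str(end), name, score, strand]))
    return rows


# First input line from the Example section of the help text.
example = ("chr7 127475281 127491632 NM_000230 0 + "
           "127486022 127488767 0 3 29,172,3225, 0,10713,13126")
for row in bed12_to_exons(example):
    print(row)
```

Run on the first example line, this yields the three NM_000230 rows listed in the help text (tab-separated). The other region choices (coding, UTR, intron, codon) would additionally need thickStart and thickEnd and are not covered here.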