annotate bed_to_gff.xml @ 6:154887a3d92f

Uploaded
author vipints
date Thu, 23 Apr 2015 17:35:13 -0400
parents 6e589f267c14
children
rev   line source
5
6e589f267c14 Uploaded
devteam
parents:
diff changeset
<tool id="fml_bed2gff" name="BED-to-GFF" version="2.0.0">
  <description>converter</description>
  <command interpreter="python">bed_to_gff.py $inf_bed &gt; $gff_format
  </command>
  <inputs>
    <param format="bed" name="inf_bed" type="data" label="Convert this query" help="Provide genome annotation in 12 column BED format."/>
  </inputs>
  <outputs>
    <data format="gff3" name="gff_format" label="${tool.name} on ${on_string}: Converted" />
  </outputs>
  <tests>
    <test>
      <param name="inf_bed" value="ccds_genes.bed" />
      <output name="gff_format" file="ccds_genes.gff3" />
    </test>
    <test>
      <param name="inf_bed" value="hs_2009.bed" />
      <output name="gff_format" file="hs_2009.gff3" />
    </test>
  </tests>
  <help>

**What it does**

This tool converts data from 12-column UCSC BED format to GFF3 (scroll down for format descriptions).

--------

**Example**

- The following data in 12-column UCSC BED format::

    chr1 11873 14409 uc001aaa.3 0 + 11873 11873 0 3 354,109,1189, 0,739,1347,

- Will be converted to GFF3::

    ##gff-version 3
    chr1 bed2gff gene 11874 14409 0 + . ID=Gene:uc001aaa.3;Name=Gene:uc001aaa.3
    chr1 bed2gff transcript 11874 14409 0 + . ID=uc001aaa.3;Name=uc001aaa.3;Parent=Gene:uc001aaa.3
    chr1 bed2gff exon 11874 12227 0 + . Parent=uc001aaa.3
    chr1 bed2gff exon 12613 12721 0 + . Parent=uc001aaa.3
    chr1 bed2gff exon 13221 14409 0 + . Parent=uc001aaa.3
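
The coordinate arithmetic in the example above can be sketched in Python. This is an illustration only, not the contents of bed_to_gff.py: the function name bed12_to_gff3_lines is hypothetical, and the attribute layout simply mirrors the example output.

```python
def bed12_to_gff3_lines(bed_line, source="bed2gff"):
    """Turn one 12-column BED line into GFF3 gene, transcript and exon lines."""
    f = bed_line.split()
    chrom, start, end = f[0], int(f[1]), int(f[2])
    name, score, strand = f[3], f[4], f[5]
    # A trailing comma in blockSizes/blockStarts yields an empty item; drop it.
    sizes = [int(x) for x in f[10].split(",") if x]
    offsets = [int(x) for x in f[11].split(",") if x]

    def row(feature_type, first, last, attributes):
        return "\t".join([chrom, source, feature_type, str(first), str(last),
                          score, strand, ".", attributes])

    # BED starts are 0-based, GFF3 starts are 1-based, so add 1 to the start.
    # The half-open BED end equals the inclusive GFF3 end, so it is unchanged.
    lines = [
        row("gene", start + 1, end,
            "ID=Gene:%s;Name=Gene:%s" % (name, name)),
        row("transcript", start + 1, end,
            "ID=%s;Name=%s;Parent=Gene:%s" % (name, name, name)),
    ]
    for size, offset in zip(sizes, offsets):
        exon_start = start + offset + 1   # block offset is relative to chromStart
        exon_end = start + offset + size  # inclusive end of the block
        lines.append(row("exon", exon_start, exon_end, "Parent=%s" % name))
    return lines


example = "chr1 11873 14409 uc001aaa.3 0 + 11873 11873 0 3 354,109,1189, 0,739,1347,"
for line in bed12_to_gff3_lines(example):
    print(line)
```

Only the start coordinate shifts: each exon starts at chromStart + blockStart + 1 and ends at chromStart + blockStart + blockSize.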

--------

**About formats**

**BED format** Browser Extensible Data format was designed at UCSC for displaying data tracks in the Genome Browser. It has three required fields and several additional optional ones:

The first three BED fields (required) are::

    1. chrom - The name of the chromosome (e.g. chr1, chrY_random).
    2. chromStart - The starting position in the chromosome. (The first base in a chromosome is numbered 0.)
    3. chromEnd - The ending position in the chromosome, plus 1 (i.e., a half-open interval).

The additional BED fields (optional) are::

    4. name - The name of the BED line.
    5. score - A score between 0 and 1000.
    6. strand - Defines the strand - either '+' or '-'.
    7. thickStart - The starting position where the feature is drawn thickly at the Genome Browser.
    8. thickEnd - The ending position where the feature is drawn thickly at the Genome Browser.
    9. reserved - This should always be set to zero.
    10. blockCount - The number of blocks (exons) in the BED line.
    11. blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
    12. blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
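
As a rough sketch of how the twelve fields fit together, the Python below parses one BED line and checks that blockSizes and blockStarts agree with blockCount. The names Bed12 and parse_bed12 are hypothetical and are not part of this tool; the score is assumed to be an integer.

```python
from collections import namedtuple

# Field names follow the list above.
Bed12 = namedtuple(
    "Bed12",
    "chrom chromStart chromEnd name score strand "
    "thickStart thickEnd reserved blockCount blockSizes blockStarts",
)


def parse_bed12(line):
    """Parse one whitespace-separated 12-column BED line into a Bed12 record."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError("expected 12 columns, found %d" % len(fields))

    def int_list(text):
        # Lists may end with a comma, which leaves an empty last item.
        return [int(item) for item in text.split(",") if item]

    record = Bed12(
        fields[0], int(fields[1]), int(fields[2]), fields[3],
        int(fields[4]), fields[5], int(fields[6]), int(fields[7]),
        fields[8], int(fields[9]), int_list(fields[10]), int_list(fields[11]),
    )
    # Both lists should have exactly blockCount items.
    if len(record.blockSizes) != record.blockCount:
        raise ValueError("blockSizes does not match blockCount")
    if len(record.blockStarts) != record.blockCount:
        raise ValueError("blockStarts does not match blockCount")
    return record


record = parse_bed12(
    "chr1 11873 14409 uc001aaa.3 0 + 11873 11873 0 3 354,109,1189, 0,739,1347,"
)
print(record.blockCount, record.blockSizes, record.blockStarts)
```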

**GFF3 format** General Feature Format is a format for describing genes and other features associated with DNA, RNA and protein sequences. GFF3 lines have nine tab-separated fields::

    1. seqid - Must be a chromosome or scaffold or contig.
    2. source - The program that generated this feature.
    3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
    4. start - The starting position of the feature in the sequence. The first base is numbered 1.
    5. stop - The ending position of the feature (inclusive).
    6. score - The score of the feature, a floating point number. If there is no score value, enter ".".
    7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
    8. phase - If the feature is a CDS, phase should be 0, 1 or 2, giving the number of bases to skip from the start of the feature to reach the first base of the next codon. For all other features, the value should be '.'.
    9. attributes - A semicolon-separated list of tag=value pairs such as ID, Name and Parent. The Parent tag links a feature to the feature it belongs to (e.g. an exon to its transcript).
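
To make the nine-field layout concrete, here is a minimal Python sketch that splits one GFF3 line and breaks the ninth column into tag=value pairs. The name parse_gff3_line is hypothetical, the keys reuse the field names listed above, and percent-encoded characters in attribute values are not decoded.

```python
GFF3_FIELDS = ["seqid", "source", "type", "start", "stop",
               "score", "strand", "phase", "attributes"]


def parse_gff3_line(line):
    """Split one tab-separated GFF3 feature line into a dictionary."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(GFF3_FIELDS):
        raise ValueError("expected 9 tab-separated fields, found %d" % len(values))
    record = dict(zip(GFF3_FIELDS, values))
    record["start"] = int(record["start"])
    record["stop"] = int(record["stop"])
    # The ninth column holds semicolon-separated tag=value pairs.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            tag, _, value = pair.partition("=")
            attributes[tag] = value
    record["attributes"] = attributes
    return record


feature = parse_gff3_line(
    "chr1\tbed2gff\ttranscript\t11874\t14409\t0\t+\t.\t"
    "ID=uc001aaa.3;Name=uc001aaa.3;Parent=Gene:uc001aaa.3"
)
print(feature["type"], feature["start"], feature["stop"], feature["attributes"]["Parent"])
```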

--------

**Copyright**

2009-2014 Max Planck Society, University of Tübingen &amp; Memorial Sloan Kettering Cancer Center

Sreedharan VT, Schultheiss SJ, Jean G, Kahles A, Bohnert R, Drewe P, Mudrakarta P, Görnitz N, Zeller G, Rätsch G. Oqtans: the RNA-seq workbench in the cloud for complete and reproducible quantitative transcriptome analysis. Bioinformatics 10.1093/bioinformatics/btt731 (2014)

  </help>
</tool>