annotate tools/filters/gff2bed.xml @ 1:cdcb0ce84a1b

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:45:15 -0500
parents 9071e359b9a3
children
rev   line source
0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
<tool id="gff2bed1" name="GFF-to-BED" version="1.0.1">
  <description>converter</description>
  <command interpreter="python">gff_to_bed_converter.py $input $out_file1</command>
  <inputs>
    <param format="gff" name="input" type="data" label="Convert this query"/>
  </inputs>
  <outputs>
    <data format="bed" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name="input" value="5.gff" ftype="gff"/>
      <output name="out_file1" file="gff2bed_out.bed"/>
    </test>
    <test>
      <param name="input" value="gff2bed_in2.gff" ftype="gff"/>
      <output name="out_file1" file="gff2bed_out2.bed"/>
    </test>
    <test>
      <!-- Test conversion of gff3 file. -->
      <param name="input" value="5.gff3" ftype="gff"/>
      <output name="out_file1" file="gff2bed_out3.bed"/>
    </test>
  </tests>
  <help>

**What it does**

This tool converts data from GFF format to BED format (scroll down for format description).

--------

**Example**

The following data in GFF format::

    chr22 GeneA enhancer 10000000 10001000 500 + . TGA
    chr22 GeneA promoter 10010000 10010100 900 + . TGA

Will be converted to BED (**note** that 1 is subtracted from the start coordinate)::

    chr22 9999999 10001000 enhancer 0 +
    chr22 10009999 10010100 promoter 0 +

------

.. class:: infomark

**About formats**

**BED format** Browser Extensible Data format was designed at UCSC for displaying data tracks in the Genome Browser. It has three required fields and several additional optional ones:

The first three BED fields (required) are::

    1. chrom - The name of the chromosome (e.g. chr1, chrY_random).
    2. chromStart - The starting position in the chromosome. (The first base in a chromosome is numbered 0.)
    3. chromEnd - The ending position in the chromosome, plus 1 (i.e., a half-open interval).

The additional BED fields (optional) are::

    4. name - The name of the BED line.
    5. score - A score between 0 and 1000.
    6. strand - Defines the strand - either '+' or '-'.
    7. thickStart - The starting position where the feature is drawn thickly at the Genome Browser.
    8. thickEnd - The ending position where the feature is drawn thickly at the Genome Browser.
    9. reserved - This should always be set to zero.
    10. blockCount - The number of blocks (exons) in the BED line.
    11. blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
    12. blockStarts - A comma-separated list of block starts. All of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
    13. expCount - The number of experiments.
    14. expIds - A comma-separated list of experiment ids. The number of items in this list should correspond to expCount.
    15. expScores - A comma-separated list of experiment scores. All of the expScores should be relative to expIds. The number of items in this list should correspond to expCount.

**GFF format** General Feature Format is a format for describing genes and other features associated with DNA, RNA, and protein sequences. GFF lines have nine tab-separated fields::

    1. seqname - Must be a chromosome or scaffold.
    2. source - The program that generated this feature.
    3. feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".
    4. start - The starting position of the feature in the sequence. The first base is numbered 1.
    5. end - The ending position of the feature (inclusive).
    6. score - A score between 0 and 1000. If there is no score value, enter ".".
    7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
    8. frame - If the feature is a coding exon, frame should be a number from 0 to 2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
    9. group - All lines with the same group are linked together into a single item.

  </help>
</tool>
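
The coordinate shift described in the help text (GFF start is 1-based and inclusive, BED start is 0-based with a half-open end) can be illustrated with a short sketch. This is not the contents of gff_to_bed_converter.py, which is not shown on this page; the function name gff_line_to_bed is hypothetical, and the choices to write the feature type as the BED name, a fixed score of 0, and '+' for an unknown strand are assumptions made only so the output mirrors the example in the help text.

```python
def gff_line_to_bed(line):
    """Convert one GFF line to a 6-column BED line, or return None to skip it.

    Illustrative sketch only; it mirrors the example in the tool's help text,
    not the actual converter script.
    """
    line = line.rstrip("\r\n")
    # Skip blank lines and comment/directive lines such as "##gff-version 3".
    if not line.strip() or line.startswith("#"):
        return None

    # GFF is tab-separated; fall back to any whitespace for hand-typed input.
    fields = line.split("\t")
    if len(fields) < 8:
        fields = line.split()
    if len(fields) < 8:
        return None

    seqname, feature = fields[0], fields[2]
    start, end, strand = fields[3], fields[4], fields[6]

    try:
        bed_start = int(start) - 1  # 1-based inclusive -> 0-based
        bed_end = int(end)          # inclusive end == half-open end, unchanged
    except ValueError:
        return None

    # Assumption: treat an unknown strand ('.') as '+'.
    if strand not in ("+", "-"):
        strand = "+"

    # Assumption: name is the GFF feature type and score is fixed at 0,
    # matching the BED lines shown in the help example.
    return "\t".join([seqname, str(bed_start), str(bed_end), feature, "0", strand])


if __name__ == "__main__":
    example = [
        "chr22\tGeneA\tenhancer\t10000000\t10001000\t500\t+\t.\tTGA",
        "chr22\tGeneA\tpromoter\t10010000\t10010100\t900\t+\t.\tTGA",
    ]
    for gff_line in example:
        print(gff_line_to_bed(gff_line))
```

Only the start coordinate changes: because the GFF end is inclusive and the BED end is exclusive, the two end values are numerically identical.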