annotate gbk_to_gff.xml @ 10:c42c69aa81f8

fixed manually the upload of version 2.1.0 - deleted accidentally added files to the repo
author vipints <vipin@cbio.mskcc.org>
date Thu, 23 Apr 2015 18:01:45 -0400
parents
children 5c6f33e20fcc
<tool id="fml_gbk2gff" name="GBK-to-GFF" version="2.1.0">
  <description>converter</description>
  <command interpreter="python">gbk_to_gff.py $inf_gbk &gt; $gff_format
  </command>
  <inputs>
    <param format="gb,gbk,genbank" name="inf_gbk" type="data" label="Convert this query" help="GenBank flat file format consists of an annotation section and a sequence section."/>
  </inputs>
  <outputs>
    <data format="gff" name="gff_format" label="${tool.name} on ${on_string}: Converted"/>
  </outputs>
  <tests>
    <test>
      <param name="inf_gbk" value="s_cerevisiae_SCU49845.gbk" />
      <output name="gff_format" file="s_cerevisiae_SCU49845.gff" />
    </test>
  </tests>
  <help>

**What it does**

This tool converts data from GenBank_ flat file format to GFF (scroll down for a description of both formats).

.. _GenBank: http://www.ncbi.nlm.nih.gov/genbank/

------

**Example**

- The following data in GenBank format::

    LOCUS       NM_001202705            2406 bp    mRNA    linear   PLN 28-MAY-2011
    DEFINITION  Arabidopsis thaliana thiamine biosynthesis protein ThiC (THIC)
                mRNA, complete cds.
    ACCESSION   NM_001202705
    VERSION     NM_001202705.1  GI:334184566.........
    FEATURES             Location/Qualifiers
         source          1..2406
                         /organism="Arabidopsis thaliana"
                         /mol_type="mRNA"
                         /db_xref="taxon:3702"........
         gene            1..2406
                         /gene="THIC"
                         /locus_tag="AT2G29630"
                         /gene_synonym="PY; PYRIMIDINE REQUIRING; T27A16.27;........
    ORIGIN
            1 aagcctttcg ctttaggctg cattgggccg tgacaatatt cagacgattc aggaggttcg
           61 ttcctttttt aaaggaccct aatcactctg agtaccactg actcactcag tgtgcgcgat
          121 tcatttcaaa aacgagccag cctcttcttc cttcgtctac tagatcagat ccaaagcttc
          181 ctcttccagc tatggctgct tcagtacact gtaccttgat gtccgtcgta tgcaacaaca
    //


- Will be converted to GFF3::

    NM_001202705  gbk2gff  chromosome  1    2406  .  +  1  ID=NM_001202705;Alias=2;Dbxref=taxon:3702;Name=NM_001202705
    NM_001202705  gbk2gff  gene        1    2406  .  +  1  ID=AT2G29630;Dbxref=GeneID:817513,TAIR:AT2G29630;Name=THIC
    NM_001202705  gbk2gff  mRNA        192  2126  .  +  1  ID=AT2G29630.t01;Parent=AT2G29630
    NM_001202705  gbk2gff  CDS         192  2126  .  +  1  ID=AT2G29630.p01;Parent=AT2G29630.t01
    NM_001202705  gbk2gff  exon        192  2126  .  +  1  Parent=AT2G29630.t01

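The ID and Parent attributes in the last column tie the converted features into a gene, mRNA, CDS/exon hierarchy. The following is a minimal Python sketch of reading that hierarchy back. It assumes tab-separated columns and simple tag=value attributes. The helper names ``parse_attributes`` and ``children_by_parent`` are illustrative only and are not part of gbk_to_gff.py.

```python
from collections import defaultdict

# Four of the example rows above, rebuilt with explicit tab separators.
PREFIX = ["NM_001202705", "gbk2gff"]
EXAMPLE_ROWS = [
    "\t".join(PREFIX + ["gene", "1", "2406", ".", "+", "1",
                        "ID=AT2G29630;Dbxref=GeneID:817513,TAIR:AT2G29630;Name=THIC"]),
    "\t".join(PREFIX + ["mRNA", "192", "2126", ".", "+", "1",
                        "ID=AT2G29630.t01;Parent=AT2G29630"]),
    "\t".join(PREFIX + ["CDS", "192", "2126", ".", "+", "1",
                        "ID=AT2G29630.p01;Parent=AT2G29630.t01"]),
    "\t".join(PREFIX + ["exon", "192", "2126", ".", "+", "1",
                        "Parent=AT2G29630.t01"]),
]


def parse_attributes(column):
    """Split the ninth column into a dict of tag -> value (hypothetical helper)."""
    attrs = {}
    for item in column.split(";"):
        tag, _, value = item.partition("=")
        if tag:
            attrs[tag] = value
    return attrs


def children_by_parent(rows):
    """Map each Parent ID to the feature types that point at it (hypothetical helper)."""
    tree = defaultdict(list)
    for row in rows:
        fields = row.split("\t")
        attrs = parse_attributes(fields[8])
        if "Parent" in attrs:
            tree[attrs["Parent"]].append(fields[2])
    return dict(tree)


if __name__ == "__main__":
    for parent, kinds in children_by_parent(EXAMPLE_ROWS).items():
        print(parent, "->", ", ".join(kinds))
```

For the rows above, the gene ID is the parent of the mRNA, and the mRNA ID is the parent of both the CDS and the exon.
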
------

**Reference**

**GBK-to-GFF** is part of the Oqtans package and is cited as [1]_.

.. [1] Sreedharan VT, Schultheiss SJ, Jean G et al., Oqtans: the RNA-seq workbench in the cloud for complete and reproducible quantitative transcriptome analysis. Bioinformatics (2014). `10.1093/bioinformatics/btt731`_

.. _10.1093/bioinformatics/btt731: http://goo.gl/I75poH

------

**About file formats**

**GenBank format** An example of a GenBank record may be viewed here_

.. _here: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

**GFF** Generic Feature Format is a format for describing genes and other features associated with DNA, RNA and protein sequences. GFF lines have nine tab-separated fields::

    1. seqid - The sequence the feature is located on, such as a chromosome, scaffold or contig.
    2. source - The program that generated this feature.
    3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
    4. start - The starting position of the feature in the sequence. The first base is numbered 1.
    5. end - The ending position of the feature (inclusive).
    6. score - A score between 0 and 1000. If there is no score value, enter ".".
    7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
    8. phase - For coding (CDS) features, 0, 1 or 2: the number of bases to skip from the start of the feature to reach the first base of the next codon. For all other features, the value should be '.'.
    9. attributes - A semicolon-separated list of tag=value pairs such as ID, Name and Parent. Features are linked together through their ID and Parent values.

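The nine fields listed above can be read with a plain split on tab characters. The following is a minimal Python sketch, assuming one feature per line and no comment or directive lines. ``GffRecord`` and ``parse_gff_line`` are hypothetical names chosen for illustration and do not come from gbk_to_gff.py.

```python
from collections import namedtuple

# One name per column, in the order given in the field list above.
GffRecord = namedtuple(
    "GffRecord",
    "seqid source type start end score strand phase attributes",
)


def parse_gff_line(line):
    """Split one tab-separated GFF line into its nine named fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated fields, got %d" % len(fields))
    record = GffRecord(*fields)
    # start and end are 1-based, inclusive coordinates, so store them as integers.
    return record._replace(start=int(record.start), end=int(record.end))


if __name__ == "__main__":
    line = "\t".join([
        "NM_001202705", "gbk2gff", "CDS", "192", "2126", ".", "+", "1",
        "ID=AT2G29630.p01;Parent=AT2G29630.t01",
    ])
    record = parse_gff_line(line)
    # Because both ends are inclusive, the length is end - start + 1.
    print(record.type, record.start, record.end, record.end - record.start + 1)
```

Score, strand and phase are left as strings here because each of them may hold the placeholder ".".
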
--------

**Copyright**

GBK-to-GFF Wrapper Version 0.6 (Apr 2015)

2009-2015 Max Planck Society, University of Tübingen &amp; Memorial Sloan Kettering Cancer Center

  </help>
</tool>