annotate tools/taxonomy/gi2taxonomy.xml @ 1:cdcb0ce84a1b

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:45:15 -0500
parents 9071e359b9a3
children
rev   line source
0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
<tool id="Fetch Taxonomic Ranks" name="Fetch taxonomic representation" version="1.1.0">
  <description></description>
  <requirements>
    <requirement type="package">taxonomy</requirement>
  </requirements>
  <command interpreter="python">gi2taxonomy.py $input $giField $idField $out_file1 ${GALAXY_DATA_INDEX_DIR}</command>
  <inputs>
    <param format="tabular" name="input" type="data" label="Show taxonomic representation for"></param>
    <param name="giField" label="GIs column" type="data_column" data_ref="input" numerical="True" help="select column containing GI numbers"/>
    <param name="idField" label="Name column" type="data_column" data_ref="input" help="select column containing identifiers you want to include into output"/>
  </inputs>
  <outputs>
    <data format="taxonomy" name="out_file1" />
  </outputs>
  <requirements>
    <requirement type="binary">taxBuilder</requirement>
  </requirements>
  <tests>
    <test>
      <param name="input" ftype="tabular" value="taxonomy2gi-input.tabular"/>
      <param name="giField" value="1"/>
      <param name="idField" value="2"/>
      <output name="out_file1" file="taxonomy2gi-output.tabular"/>
    </test>
  </tests>

  <help>
.. class:: infomark

Use *Filter and Sort->Filter* to restrict the output of this tool to the desired taxonomic ranks. You can also use *Text Manipulation->Cut* to remove unwanted columns from the output.

------

**What it does**

Fetches taxonomic information for a list of GI numbers (sequence identifiers used by the National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov).

-------

**Example**

Suppose you have BLAST output that looks like this::

  +-----------------------+----------+----------+-----------------+------------+------+--------+
  | queryId               | targetGI | identity | alignmentLength | mismatches | gaps | score  |
  +-----------------------+----------+----------+-----------------+------------+------+--------+
  | 1L_EYKX4VC01BXWX1_265 | 1430919  | 90.09    | 212             | 15         | 6    | 252.00 |
  +-----------------------+----------+----------+-----------------+------------+------+--------+

and you want to obtain the full taxonomic representation for the GIs listed in the *targetGI* column. If you set parameters as shown here:

.. image:: ./static/images/fetchTax.png


the tool will generate the following output (you may need to scroll sideways to see the entire line)::

  1                      2     3     4          5        6  7  8         9         10             11        12  13                14        15           16          17         18  19  20  21    22  23            24  25
  1L_EYKX4VC01BXWX1_265  9606  root  Eukaryota  Metazoa  n  n  Chordata  Craniata  Gnathostomata  Mammalia  n   Euarchontoglires  Primates  Haplorrhini  Hominoidea  Hominidae  n   n   n   Homo  n   Homo sapiens  n   1430919

In other words, the tool printed the *Name column* and the *taxonomy ID*, appended 22 columns containing taxonomic ranks from root to subspecies, and added the *GI* as the last column. Below is a formal definition of the output columns::

   Column  Definition
  -------  ------------------------------------------
        1  Name (specified by 'Name column' dropdown)
        2  Taxonomy ID
        3  root
        4  superkingdom
        5  kingdom
        6  subkingdom
        7  superphylum
        8  phylum
        9  subphylum
       10  superclass
       11  class
       12  subclass
       13  superorder
       14  order
       15  suborder
       16  superfamily
       17  family
       18  subfamily
       19  tribe
       20  subtribe
       21  genus
       22  subgenus
       23  species
       24  subspecies
       25  GI (specified by 'GIs column' dropdown)

------

.. class:: warningmark

**Why do I have these "n" things?**

Be aware that the NCBI taxonomy (ftp://ftp.ncbi.nih.gov/pub/taxonomy/) this tool relies upon is incomplete. This means that for many species one or more ranks are absent and are represented as "**n**". In the example above, *subkingdom*, *superphylum*, and several other ranks are missing.


  </help>
</tool>
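The layout of the example output in the help text (name, taxonomy ID, 22 ranks from root to subspecies, GI) can be unpacked with a few lines of Python when the result is processed outside Galaxy. This is only a sketch under stated assumptions: it assumes the output file is tab-separated, and `RANKS`, `EXAMPLE`, and `parse_taxonomy_line` are hypothetical names that are not part of gi2taxonomy.py.

```python
# Rank names for output columns 3-24, in the order shown by the help text example.
RANKS = [
    "root", "superkingdom", "kingdom", "subkingdom", "superphylum", "phylum",
    "subphylum", "superclass", "class", "subclass", "superorder", "order",
    "suborder", "superfamily", "family", "subfamily", "tribe", "subtribe",
    "genus", "subgenus", "species", "subspecies",
]


def parse_taxonomy_line(line):
    """Split one output line into name, taxonomy ID, GI, and lineage.

    Hypothetical helper; assumes tab-separated fields (an assumption,
    not something the tool definition states explicitly).
    """
    fields = line.rstrip("\n").split("\t")
    expected = len(RANKS) + 3  # name + taxonomy ID + 22 ranks + GI
    if len(fields) != expected:
        raise ValueError("expected %d columns, got %d" % (expected, len(fields)))
    # "n" marks a rank that is absent from the NCBI taxonomy, so drop it.
    lineage = {
        rank: value
        for rank, value in zip(RANKS, fields[2:-1])
        if value != "n"
    }
    return {
        "name": fields[0],
        "tax_id": fields[1],
        "gi": fields[-1],
        "lineage": lineage,
    }


# The example row from the help text, rebuilt with tab separators.
EXAMPLE = "\t".join([
    "1L_EYKX4VC01BXWX1_265", "9606", "root", "Eukaryota", "Metazoa", "n", "n",
    "Chordata", "Craniata", "Gnathostomata", "Mammalia", "n",
    "Euarchontoglires", "Primates", "Haplorrhini", "Hominoidea", "Hominidae",
    "n", "n", "n", "Homo", "n", "Homo sapiens", "n", "1430919",
])

if __name__ == "__main__":
    row = parse_taxonomy_line(EXAMPLE)
    print(row["name"], row["tax_id"], row["gi"])
    print(row["lineage"]["species"])
```

Dropping the "n" placeholders keeps only the ranks that the NCBI taxonomy actually defines for the record, which is usually what a downstream summary needs.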