Galaxy | Tool Preview

Fetch taxonomic representation (version 1.1.0)
select column containing GI numbers
select column containing identifiers you want to include into output

Use Filter and Sort->Filter to restrict output of this tool to desired taxonomic ranks. You can also use Text Manipulation->Cut to remove unwanted columns from the output.


What it does

Fetches taxonomic information for a list of GI numbers (sequences identifiers used by the National Center for Biotechnology Information http://www.ncbi.nlm.nih.gov).


Example

Suppose you have BLAST output that looks like this:

+-----------------------+----------+----------+-----------------+------------+------+--------+
| queryId               | targetGI | identity | alignmentLength | mismatches | gaps | score  |
+-----------------------+----------+----------+-----------------+------------+------+--------+
| 1L_EYKX4VC01BXWX1_265 |  1430919 |    90.09 |             212 |         15 |    6 | 252.00 |
+-----------------------+----------+----------+-----------------+------------+------+--------+

and you want to obtain full taxonomic representation for GIs listed in targetGI column. If you set parameters as shown here:

/repository/static/images/fe76175bf9c49d42/images%2FfetchTax.png

the tool will generate the following output (you may need to scroll sideways to see the entire line):

1                     2    3    4         5       6 7 8        9        10            11       12 13               14       15         16          17        18  19  20 21  22  23           24 25
1L_EYKX4VC01BXWX1_265 9606 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n  Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n   n   n  Homo n  Homo sapiens n  1430919

In other words the tool printed Name column, taxonomy Id, appended 22 columns containing taxonomic ranks from Superkingdom to Subspecies and added GI as the last column. Below is a formal definition of the output columns:

 Column Definition
------- ------------------------------------------
      1 Name (specified by 'Name column' dropdown)
      2 GI   (specified by 'GI column' dropdown)
      3 root
      4 superkingdom
      5 kingdom
      6 subkingdom
      7 superphylum
      8 phylum
      9 subphylum
     10 superclass
     11 class
     12 subclass
     13 superorder
     14 order
     15 suborder
     16 superfamily
     17 family
     18 subfamily
     19 tribe
     20 subtribe
     21 genus
     22 subgenus
     23 species
     24 subspecies

Why do I have these "n" things?

Be aware that the NCBI taxonomy (ftp://ftp.ncbi.nih.gov/pub/taxonomy/) this tool relies upon is incomplete. This means that for many species one or more ranks are absent and represented as "n". In the above example subkingdom, superphylum etc. are missing.