comparison test-data/test-db/readme.txt @ 0:0fd79958fac6 draft

planemo upload for repository https://github.com/shenwei356/taxonkit commit 695ea582a8d3bf7845dd4cddbc8b591e4b6c4e82
author iuc
date Fri, 26 Jul 2024 09:26:02 +0000
-1:000000000000 0:0fd79958fac6
1 *.dmp files are bcp-like dumps from the GenBank taxonomy database.
2
3 General information.
4 Field terminator is "\t|\t"
5 Row terminator is "\t|\n"
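
As an illustration of the two terminators above, here is a minimal Python sketch that splits one raw row into its fields. The function name split_dmp_line and the sample row are assumptions made for this example; they are not part of the dump or of any tool.

```python
from typing import List


def split_dmp_line(line: str) -> List[str]:
    """Split one raw .dmp row into its fields."""
    # Drop the newline, then the "\t|" that is left over from the
    # row terminator "\t|\n".
    row = line.rstrip("\n")
    if row.endswith("\t|"):
        row = row[:-2]
    # What remains is separated by the field terminator "\t|\t".
    return row.split("\t|\t")


# Illustrative, shortened row (real nodes.dmp rows carry more fields).
sample = "9606\t|\t9605\t|\tspecies\t|\n"
print(split_dmp_line(sample))
```

Splitting on a bare tab or a bare "|" is avoided here because free-text fields may contain those characters on their own.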
6
7 nodes.dmp file consists of taxonomy nodes. The description for each node includes the following
8 fields:
9 tax_id -- node id in GenBank taxonomy database
10 parent tax_id -- parent node id in GenBank taxonomy database
11 rank -- rank of this node (superkingdom, kingdom, ...)
12 embl code -- locus-name prefix; not unique
13 division id -- see division.dmp file
14 inherited div flag (1 or 0) -- 1 if node inherits division from parent
15 genetic code id -- see gencode.dmp file
16 inherited GC flag (1 or 0) -- 1 if node inherits genetic code from parent
17 mitochondrial genetic code id -- see gencode.dmp file
18 inherited MGC flag (1 or 0) -- 1 if node inherits mitochondrial gencode from parent
19 GenBank hidden flag (1 or 0) -- 1 if name is suppressed in GenBank entry lineage
20 hidden subtree root flag (1 or 0) -- 1 if this subtree has no sequence data yet
21 comments -- free-text comments and citations
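
The first three nodes.dmp fields listed above are enough to walk from a node up to the root. The sketch below assumes that the root node lists itself as its own parent, and it uses made-up toy ids; Node, parse_nodes, lineage and toy_row are hypothetical helper names for this example only.

```python
from typing import Dict, Iterable, List, NamedTuple


class Node(NamedTuple):
    tax_id: int
    parent_tax_id: int
    rank: str


def parse_nodes(lines: Iterable[str]) -> Dict[int, Node]:
    """Read nodes.dmp rows, keeping tax_id, parent tax_id and rank."""
    nodes: Dict[int, Node] = {}
    for line in lines:
        row = line.rstrip("\n")
        if row.endswith("\t|"):
            row = row[:-2]
        fields = row.split("\t|\t")
        node = Node(int(fields[0]), int(fields[1]), fields[2])
        nodes[node.tax_id] = node
    return nodes


def lineage(tax_id: int, nodes: Dict[int, Node]) -> List[int]:
    """Return the ids from tax_id up to the root (assumed self-parented)."""
    path = [tax_id]
    while nodes[tax_id].parent_tax_id != tax_id:
        tax_id = nodes[tax_id].parent_tax_id
        if tax_id in path:  # guard against malformed, cyclic input
            raise ValueError("cycle in parent links")
        path.append(tax_id)
    return path


def toy_row(tax_id: int, parent: int, rank: str) -> str:
    """Build a 13-field toy row; the last ten values are placeholders."""
    rest = ["", "0", "0", "1", "0", "0", "0", "0", "0", ""]
    return "\t|\t".join([str(tax_id), str(parent), rank] + rest) + "\t|\n"


# Toy tree with invented ids: 1 (root) -> 10 -> 100
rows = [toy_row(1, 1, "no rank"), toy_row(10, 1, "genus"), toy_row(100, 10, "species")]
nodes = parse_nodes(rows)
print(lineage(100, nodes))  # [100, 10, 1]
```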
22
23 Taxonomy names file (names.dmp):
24 tax_id -- the id of node associated with this name
25 name_txt -- name itself
26 unique name -- the unique variant of this name if name is not unique
27 name class -- (synonym, common name, ...)
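
Because one tax_id can have several rows in names.dmp, a lookup usually groups names by name class. The sketch below assumes a name class called "scientific name" is the one wanted by default; load_names and preferred_name are hypothetical helper names, and the two rows are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def load_names(lines: Iterable[str]) -> Dict[int, Dict[str, List[str]]]:
    """Group name_txt values by tax_id and then by name class."""
    names: Dict[int, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for line in lines:
        row = line.rstrip("\n")
        if row.endswith("\t|"):
            row = row[:-2]
        tax_id, name_txt, _unique_name, name_class = row.split("\t|\t")
        names[int(tax_id)][name_class].append(name_txt)
    return names


def preferred_name(
    tax_id: int,
    names: Dict[int, Dict[str, List[str]]],
    name_class: str = "scientific name",  # assumed default class
) -> Optional[str]:
    """Return the first name of the given class, or None if absent."""
    candidates = names.get(tax_id, {}).get(name_class, [])
    return candidates[0] if candidates else None


rows = [
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n",
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n",
]
names = load_names(rows)
print(preferred_name(9606, names))
```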
28
29 Divisions file (division.dmp):
30 division id -- taxonomy database division id
31 division cde -- GenBank division code (three characters), e.g. BCT, PLN, VRT, MAM, PRI...
32 division name -- full name of the division
33 comments
34
35 Genetic codes file (gencode.dmp):
36 genetic code id -- GenBank genetic code id
37 abbreviation -- genetic code name abbreviation
38 name -- genetic code name
39 cde -- translation table for this genetic code
40 starts -- start codons for this genetic code
41
42 Deleted nodes file (delnodes.dmp):
43 tax_id -- deleted node id
44
45 Merged nodes file (merged.dmp):
46 old_tax_id -- id of the node that has been merged
47 new_tax_id -- id of the node that is the result of merging
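
The deleted and merged files are typically used together to bring an old tax_id up to date. The following is a sketch under that assumption, with invented ids; parse_merged, parse_deleted and resolve are hypothetical names, not functions of any existing tool.

```python
from typing import Dict, Iterable, List, Optional, Set


def _fields(line: str) -> List[str]:
    """Split one .dmp row on the field and row terminators."""
    row = line.rstrip("\n")
    if row.endswith("\t|"):
        row = row[:-2]
    return row.split("\t|\t")


def parse_merged(lines: Iterable[str]) -> Dict[int, int]:
    """Map old_tax_id -> new_tax_id from merged.dmp rows."""
    merged: Dict[int, int] = {}
    for line in lines:
        old_id, new_id = _fields(line)
        merged[int(old_id)] = int(new_id)
    return merged


def parse_deleted(lines: Iterable[str]) -> Set[int]:
    """Collect deleted tax_ids from delnodes.dmp rows."""
    return {int(_fields(line)[0]) for line in lines}


def resolve(tax_id: int, merged: Dict[int, int], deleted: Set[int]) -> Optional[int]:
    """Follow merges to the current id; return None for a deleted node."""
    seen: Set[int] = set()
    while tax_id in merged and tax_id not in seen:
        seen.add(tax_id)
        tax_id = merged[tax_id]
    return None if tax_id in deleted else tax_id


# Invented ids for illustration only.
merged = parse_merged(["500\t|\t600\t|\n"])
deleted = parse_deleted(["700\t|\n"])
print(resolve(500, merged, deleted))  # 600
print(resolve(700, merged, deleted))  # None
```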
48
49 Citations file (citations.dmp):
50 cit_id -- the unique id of citation
51 cit_key -- citation key
52 pubmed_id -- unique id in PubMed database (0 if not in PubMed)
53 medline_id -- unique id in MedLine database (0 if not in MedLine)
54 url -- URL associated with citation
55 text -- any text (usually article name and authors).
56 -- The following characters are escaped in this text by a backslash:
57 -- newline (appears as "\n"),
58 -- tab character ("\t"),
59 -- double quotes ('\"'),
60 -- backslash character ("\\").
61 taxid_list -- list of node ids separated by a single space
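
The escaping rules for the citation text and the space-separated taxid_list can be handled as in the sketch below. It decodes all four escapes in a single pass so that an escaped backslash followed by "n" is not mistaken for a newline. The names unescape_citation_text and parse_taxid_list are hypothetical, and leaving unknown escape sequences unchanged is an assumption of this example.

```python
import re
from typing import List

# Escapes listed for the citations.dmp text field.
_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def unescape_citation_text(text: str) -> str:
    """Decode backslash escapes in one left-to-right pass."""
    # Unknown sequences are left unchanged (an assumption of this sketch).
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(0)), text)


def parse_taxid_list(field: str) -> List[int]:
    """Split the taxid_list field on whitespace into integer ids."""
    return [int(tax_id) for tax_id in field.split()]


print(repr(unescape_citation_text('Title\\tpart \\"one\\"')))
print(parse_taxid_list("9606 10090"))  # [9606, 10090]
```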