annotate test-data/test-db/readme.txt @ 0:0fd79958fac6 draft

planemo upload for repository https://github.com/shenwei356/taxonkit commit 695ea582a8d3bf7845dd4cddbc8b591e4b6c4e82
author iuc
date Fri, 26 Jul 2024 09:26:02 +0000
rev   line source
1 *.dmp files are bcp-like dumps from the GenBank taxonomy database.
2
3 General information.
4 Field terminator is "\t|\t"
5 Row terminator is "\t|\n"
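
The two terminators above are enough to split a raw row into fields. A minimal sketch in Python, assuming each row is read as one text line; the function name split_dmp_row is hypothetical and not part of any tool:

```python
# Terminators as stated in the readme.
FIELD_TERMINATOR = "\t|\t"
ROW_TERMINATOR = "\t|\n"


def split_dmp_row(row):
    """Split one raw row of a .dmp file into a list of field strings."""
    if row.endswith(ROW_TERMINATOR):
        row = row[: -len(ROW_TERMINATOR)]
    else:
        # Assumption: tolerate rows whose newline was already stripped.
        row = row.rstrip("\n")
        if row.endswith("\t|"):
            row = row[:-2]
    return row.split(FIELD_TERMINATOR)


# Illustrative row, written for this example.
print(split_dmp_row("1\t|\tall\t|\t\t|\tsynonym\t|\n"))
```

Empty fields are kept as empty strings, so field positions stay aligned with the field lists below.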
6
7 nodes.dmp file consists of taxonomy nodes. The description for each node includes the following
8 fields:
9 tax_id -- node id in GenBank taxonomy database
10 parent tax_id -- parent node id in GenBank taxonomy database
11 rank -- rank of this node (superkingdom, kingdom, ...)
12 embl code -- locus-name prefix; not unique
13 division id -- see division.dmp file
14 inherited div flag (1 or 0) -- 1 if node inherits division from parent
15 genetic code id -- see gencode.dmp file
16 inherited GC flag (1 or 0) -- 1 if node inherits genetic code from parent
17 mitochondrial genetic code id -- see gencode.dmp file
18 inherited MGC flag (1 or 0) -- 1 if node inherits mitochondrial gencode from parent
19 GenBank hidden flag (1 or 0) -- 1 if name is suppressed in GenBank entry lineage
20 hidden subtree root flag (1 or 0) -- 1 if this subtree has no sequence data yet
21 comments -- free-text comments and citations
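
The tax_id and parent tax_id fields are what make nodes.dmp a tree. A minimal sketch of loading them and walking a lineage, assuming the root node lists itself as its own parent; the function names and the toy ids (1, 10, 20) are hypothetical, written for this example:

```python
def _fields(line):
    """Split a .dmp row on the field terminator after dropping the row terminator."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def load_nodes(lines):
    """Map tax_id -> (parent tax_id, rank) using the first three fields."""
    nodes = {}
    for line in lines:
        fields = _fields(line)
        nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def lineage(nodes, tax_id):
    """Return the ids from the root down to tax_id."""
    path = [tax_id]
    while True:
        parent = nodes[path[-1]][0]
        if parent == path[-1]:  # assumption: the root is its own parent
            break
        path.append(parent)
    return path[::-1]


def make_row(*fields):
    """Build a row in the dump format (helper for the toy data only)."""
    return "\t|\t".join(fields) + "\t|\n"


# Toy rows with the 13 fields listed above; not copied from any real dump.
rows = [
    make_row("1", "1", "no rank", "", "8", "0", "1", "0", "0", "0", "0", "0", ""),
    make_row("10", "1", "superkingdom", "", "0", "0", "11", "0", "0", "0", "0", "0", ""),
    make_row("20", "10", "genus", "", "0", "1", "11", "1", "0", "1", "0", "0", ""),
]
nodes = load_nodes(rows)
print(lineage(nodes, 20))
```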
22
23 Taxonomy names file (names.dmp):
24 tax_id -- the id of node associated with this name
25 name_txt -- name itself
26 unique name -- the unique variant of this name if name not unique
27 name class -- (synonym, common name, ...)
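
Because one tax_id can carry several names, a lookup table usually keeps one name class per node. A minimal sketch, assuming the name class "scientific name" marks the primary name (the readme only lists "synonym, common name, ..."); the function name and sample rows are hypothetical:

```python
def _fields(line):
    """Split a .dmp row on the field terminator after dropping the row terminator."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def names_by_class(lines, wanted_class="scientific name"):
    """Map tax_id -> name_txt for rows whose name class equals wanted_class."""
    names = {}
    for line in lines:
        tax_id, name_txt, unique_name, name_class = _fields(line)[:4]
        if name_class == wanted_class:
            names[int(tax_id)] = name_txt
    return names


# Illustrative rows written for this example.
name_rows = [
    "2\t|\tBacteria\t|\tBacteria <bacteria>\t|\tscientific name\t|\n",
    "2\t|\teubacteria\t|\t\t|\tgenbank common name\t|\n",
]
sci_names = names_by_class(name_rows)
print(sci_names)
```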
28
29 Divisions file (division.dmp):
30 division id -- taxonomy database division id
31 division cde -- GenBank division code (three characters), e.g. BCT, PLN, VRT, MAM, PRI...
32 division name -- full name of the division
33 comments
34
35 Genetic codes file (gencode.dmp):
36 genetic code id -- GenBank genetic code id
37 abbreviation -- genetic code name abbreviation
38 name -- genetic code name
39 cde -- translation table for this genetic code
40 starts -- start codons for this genetic code
41
42 Deleted nodes file (delnodes.dmp):
43 tax_id -- deleted node id
44
45 Merged nodes file (merged.dmp):
46 old_tax_id -- id of the node which has been merged
47 new_tax_id -- id of the node which is the result of merging
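
The delnodes.dmp and merged.dmp files together tell whether an old tax_id is still usable. A minimal sketch of that check, assuming a deleted id should map to None and an unlisted id is still current; all function names and the toy ids are hypothetical:

```python
def _fields(line):
    """Split a .dmp row on the field terminator after dropping the row terminator."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def load_deleted(lines):
    """Set of tax_ids listed in delnodes.dmp."""
    return {int(_fields(line)[0]) for line in lines}


def load_merged(lines):
    """Map old_tax_id -> new_tax_id from merged.dmp."""
    merged = {}
    for line in lines:
        old_tax_id, new_tax_id = _fields(line)[:2]
        merged[int(old_tax_id)] = int(new_tax_id)
    return merged


def resolve_tax_id(tax_id, merged, deleted):
    """Return the current id for tax_id, or None if the node was deleted."""
    if tax_id in deleted:
        return None
    return merged.get(tax_id, tax_id)


# Toy rows written for this example.
deleted = load_deleted(["99\t|\n"])
merged = load_merged(["12\t|\t74109\t|\n"])
print(resolve_tax_id(12, merged, deleted))
```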
48
49 Citations file (citations.dmp):
50 cit_id -- the unique id of citation
51 cit_key -- citation key
52 pubmed_id -- unique id in PubMed database (0 if not in PubMed)
53 medline_id -- unique id in MedLine database (0 if not in MedLine)
54 url -- URL associated with citation
55 text -- any text (usually article name and authors).
56 -- The following characters are escaped in this text by a backslash:
57 -- newline (appears as "\n"),
58 -- tab character ("\t"),
59 -- double quotes ('\"'),
60 -- backslash character ("\\").
61 taxid_list -- list of node ids separated by a single space
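
A citations.dmp row needs two extra steps beyond field splitting: undoing the backslash escapes in text and splitting taxid_list. A minimal sketch, assuming the escapes appear as the literal two-character sequences listed above and that a row has exactly the seven fields described; the function names and the sample row are hypothetical:

```python
import re

# Escaped character after the backslash -> the character it stands for.
_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def unescape_text(text):
    """Undo the four backslash escapes in one left-to-right pass."""
    return re.sub(r'\\([nt"\\])', lambda m: _ESCAPES[m.group(1)], text)


def parse_citation(line):
    """Parse one citations.dmp row into a dict keyed by the field names."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    cit_id, cit_key, pubmed_id, medline_id, url, text, taxid_list = line.split("\t|\t")
    return {
        "cit_id": int(cit_id),
        "cit_key": cit_key,
        "pubmed_id": int(pubmed_id),    # 0 if not in PubMed
        "medline_id": int(medline_id),  # 0 if not in MedLine
        "url": url,
        "text": unescape_text(text),
        "taxid_list": [int(t) for t in taxid_list.split()],
    }


# Toy row written for this example; not taken from a real dump.
sample = "\t|\t".join(
    ["5", "Example key", "0", "0", "", r'A \"quoted\" title\\subtitle', "10 20"]
) + "\t|\n"
rec = parse_citation(sample)
print(rec["text"], rec["taxid_list"])
```

Handling all four escapes in a single regular-expression pass avoids the double-unescaping that chained str.replace calls can cause on an escaped backslash followed by "n" or "t".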