comparison tools/ncbi_blast_plus/README.rst @ 11:4c4a0da938ff draft

Uploaded v0.0.22, now wraps BLAST+ 2.2.28 allowing extended tabular output to include the hit descriptions as column 25. Supports $GALAXY_SLOTS. Includes more tests and heavy use of macros.
author peterjc
date Thu, 05 Dec 2013 06:55:59 -0500
parents 70e7dcbf6573
children 623f727cdff1
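
The commit message above says the wrappers now support ``$GALAXY_SLOTS``. The wrapper XML that does this is not part of this README diff, so the following is only a minimal sketch of the idea: read the slot count Galaxy exports and pass it to BLAST+ through its ``-num_threads`` option. The function name ``blast_thread_args`` and the fallback of one thread are assumptions made for illustration.

```python
import os


def blast_thread_args(default_threads=1, environ=None):
    """Build the BLAST+ thread arguments from GALAXY_SLOTS.

    `environ` defaults to os.environ; it is a parameter only so the
    function can be exercised without touching the real environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get("GALAXY_SLOTS", "")
    try:
        threads = int(raw)
    except ValueError:
        # Variable unset or not a number: fall back to the default.
        threads = default_threads
    if threads < 1:
        threads = default_threads
    return ["-num_threads", str(threads)]
```

The returned list can be appended to a command such as ``["blastn", "-query", "in.fasta", ...]`` before it is run; outside Galaxy the variable is normally unset, so the default applies.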
10:70e7dcbf6573 11:4c4a0da938ff
3 3
4 These wrappers are copyright 2010-2013 by Peter Cock, The James Hutton Institute 4 These wrappers are copyright 2010-2013 by Peter Cock, The James Hutton Institute
5 (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved. 5 (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
6 See the licence text below. 6 See the licence text below.
7 7
8 Currently tested with NCBI BLAST 2.2.26+ (i.e. version 2.2.26 of BLAST+), 8 Currently tested with NCBI BLAST 2.2.28+ (i.e. version 2.2.28 of BLAST+),
9 and does not work with the NCBI 'legacy' BLAST suite (e.g. blastall). 9 and does not work with the NCBI 'legacy' BLAST suite (e.g. ``blastall``).
10 10
11 Note that these wrappers (and the associated datatypes) were originally 11 Note that these wrappers (and the associated datatypes) were originally
12 distributed as part of the main Galaxy repository, but as of August 2012 12 distributed as part of the main Galaxy repository, but as of August 2012
13 moved to the Galaxy Tool Shed as 'ncbi_blast_plus' (and 'blast_datatypes'). 13 moved to the Galaxy Tool Shed as ``ncbi_blast_plus`` (and ``blast_datatypes``).
14 My thanks to Dannon Baker from the Galaxy development team for his assistance 14 My thanks to Dannon Baker from the Galaxy development team for his assistance
15 with this. 15 with this.
16 16
17 These wrappers are available from the Galaxy Tool Shed at: 17 These wrappers are available from the Galaxy Tool Shed at:
18 http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus 18 http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
20 20
21 Automated Installation 21 Automated Installation
22 ====================== 22 ======================
23 23
24 Galaxy should be able to automatically install the dependencies, i.e. the 24 Galaxy should be able to automatically install the dependencies, i.e. the
25 'blast_datatypes' repository which defines the BLAST XML file format 25 ``blast_datatypes`` repository which defines the BLAST XML file format
26 ('blastxml') and protein and nucleotide BLAST databases ('blastdbp' and 26 (``blastxml``) and protein and nucleotide BLAST databases (``blastdbp`` and
27 'blastdbn'). 27 ``blastdbn``).
28 28
29 You must tell Galaxy about any system level BLAST databases using configuration 29 You must tell Galaxy about any system level BLAST databases using configuration
30 files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein 30 files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein
31 databases like NR), and blastdb_d.loc (protein domain databases like CDD or 31 databases like NR), and blastdb_d.loc (protein domain databases like CDD or
32 SMART) which are located in the tool-data/ folder. Sample files are included 32 SMART) which are located in the tool-data/ folder. Sample files are included
40 40
41 Manual Installation 41 Manual Installation
42 =================== 42 ===================
43 43
44 For those not using Galaxy's automated installation from the Tool Shed, put 44 For those not using Galaxy's automated installation from the Tool Shed, put
45 the XML and Python files in the tools/ncbi_blast_plus/ folder and add the XML 45 the XML and Python files in the ``tools/ncbi_blast_plus/`` folder and add the
46 files to your tool_conf.xml as normal (and do the same in tool_conf.xml.sample 46 XML files to your ``tool_conf.xml`` as normal (and do the same in
47 in order to run the unit tests). For example, use:: 47 ``tool_conf.xml.sample`` in order to run the unit tests). For example, use::
48 48
49 <section name="NCBI BLAST+" id="ncbi_blast_plus_tools"> 49 <section name="NCBI BLAST+" id="ncbi_blast_plus_tools">
50 <tool file="ncbi_blast_plus/ncbi_blastn_wrapper.xml" /> 50 <tool file="ncbi_blast_plus/ncbi_blastn_wrapper.xml" />
51 <tool file="ncbi_blast_plus/ncbi_blastp_wrapper.xml" /> 51 <tool file="ncbi_blast_plus/ncbi_blastp_wrapper.xml" />
52 <tool file="ncbi_blast_plus/ncbi_blastx_wrapper.xml" /> 52 <tool file="ncbi_blast_plus/ncbi_blastx_wrapper.xml" />
53 <tool file="ncbi_blast_plus/ncbi_tblastn_wrapper.xml" /> 53 <tool file="ncbi_blast_plus/ncbi_tblastn_wrapper.xml" />
54 <tool file="ncbi_blast_plus/ncbi_tblastx_wrapper.xml" /> 54 <tool file="ncbi_blast_plus/ncbi_tblastx_wrapper.xml" />
55 <tool file="ncbi_blast_plus/ncbi_makeblastdb.xml" /> 55 <tool file="ncbi_blast_plus/ncbi_makeblastdb.xml" />
56 <tool file="ncbi_blast_plus/ncbi_dustmasker_wrapper.xml" />
56 <tool file="ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml" /> 57 <tool file="ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml" />
57 <tool file="ncbi_blast_plus/ncbi_blastdbcmd_info.xml" /> 58 <tool file="ncbi_blast_plus/ncbi_blastdbcmd_info.xml" />
58 <tool file="ncbi_blast_plus/ncbi_rpsblast_wrapper.xml" /> 59 <tool file="ncbi_blast_plus/ncbi_rpsblast_wrapper.xml" />
59 <tool file="ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml" /> 60 <tool file="ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml" />
60 <tool file="ncbi_blast_plus/blastxml_to_tabular.xml" /> 61 <tool file="ncbi_blast_plus/blastxml_to_tabular.xml" />
61 </section> 62 </section>
62 63
63 You will also need to install 'blast_datatypes' from the Tool Shed. This 64 You will also need to install ``blast_datatypes`` from the Tool Shed. This
64 defines the BLAST XML file format ('blastxml') and protein and nucleotide 65 defines the BLAST XML file format (``blastxml``) and protein and nucleotide
65 BLAST databases composite file formats ('blastdbp' and 'blastdbn'). 66 BLAST databases composite file formats (``blastdbp`` and ``blastdbn``):
67
68 * http://toolshed.g2.bx.psu.edu/view/devteam/blast_datatypes
66 69
67 As described above for an automated installation, you must also tell Galaxy 70 As described above for an automated installation, you must also tell Galaxy
68 about any system level BLAST databases using the tool-data/blastdb*.loc files. 71 about any system level BLAST databases using the ``tool-data/blastdb*.loc``
72 files.
69 73
70 You must install the NCBI BLAST+ standalone tools somewhere on the system 74 You must install the NCBI BLAST+ standalone tools somewhere on the system
71 path. Currently the unit tests are written using "BLAST 2.2.26+". 75 path. Currently the unit tests are written using "BLAST 2.2.28+".
72 76
73 Run the functional tests (adjusting the section identifier to match your 77 Run the functional tests (adjusting the section identifier to match your
74 tool_conf.xml.sample file):: 78 ``tool_conf.xml.sample`` file)::
75 79
76 ./run_functional_tests.sh -sid NCBI_BLAST+-ncbi_blast_plus_tools 80 ./run_functional_tests.sh -sid NCBI_BLAST+-ncbi_blast_plus_tools
77 81
78 82
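
The section identifier passed to ``-sid`` above appears to combine the ``name`` and ``id`` attributes of the ``<section>`` element shown earlier. The sketch below reproduces that one example; the rule (spaces in the name become underscores, then a hyphen, then the id) is inferred from this single case and is an assumption, not a documented Galaxy guarantee. The helper name ``section_identifier`` is hypothetical.

```python
def section_identifier(name, section_id):
    """Guess the -sid value for a tool_conf.xml section.

    Assumed rule, inferred from the README example only:
    spaces in the section name are replaced by underscores,
    and the result is joined to the section id with a hyphen.
    """
    return "{}-{}".format(name.replace(" ", "_"), section_id)


if __name__ == "__main__":
    print(section_identifier("NCBI BLAST+", "ncbi_blast_plus_tools"))
```

If your ``tool_conf.xml.sample`` uses a different section name or id, pass those values instead and check the result against what ``run_functional_tests.sh`` accepts.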
79 History 83 History
115 - Tweak dependency on blast_datatypes to also work on Test Tool Shed. 119 - Tweak dependency on blast_datatypes to also work on Test Tool Shed.
116 - Dependency on new package_blast_plus_2_2_26 in Tool Shed. 120 - Dependency on new package_blast_plus_2_2_26 in Tool Shed.
117 - Adopted standard MIT License. 121 - Adopted standard MIT License.
118 - Development moved to GitHub, https://github.com/peterjc/galaxy_blast 122 - Development moved to GitHub, https://github.com/peterjc/galaxy_blast
119 - Updated citation information (Cock et al. 2013). 123 - Updated citation information (Cock et al. 2013).
124 v0.0.21 - Use macros to simplify the XML wrappers.
125 - Added wrapper for dustmasker
126 - Enabled masking for makeblastdb
127 - Requires 'maskinfo-asn1' and 'maskinfo-asn1-binary' datatypes
128 defined in updated blast_datatypes on Galaxy ToolShed.
129 - Tests updated for BLAST+ 2.2.27 instead of BLAST+ 2.2.26
130 - Now depends on package_blast_plus_2_2_27 in ToolShed
131 v0.0.22 - More use of macros to simplify the wrappers
132 - Set number of threads via $GALAXY_SLOTS environment variable
133 - More descriptive default output names
134 - Tests require updated BLAST DB definitions (blast_datatypes v0.0.18)
135 - Pre-check for duplicate identifiers in makeblastdb wrapper.
136 - Tests updated for BLAST+ 2.2.28 instead of BLAST+ 2.2.27
137 - Now depends on package_blast_plus_2_2_28 in ToolShed
138 - Extended tabular output includes 'salltitles' as column 25.
120 ======= ====================================================================== 139 ======= ======================================================================
121 140
122 141
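
The v0.0.22 entry above notes that the extended tabular output carries ``salltitles`` as column 25. As a rough illustration of consuming that column, the sketch below pulls the hit descriptions out of one tab-separated line. It assumes that multiple titles are joined with ``<>``, which is how BLAST+ describes ``salltitles``; verify this against your BLAST+ version. The function name ``hit_descriptions`` is hypothetical.

```python
def hit_descriptions(tabular_line):
    """Return the hit descriptions from one extended tabular line.

    Column 25 (index 24) is taken to be 'salltitles'. Multiple
    titles are assumed to be separated by the two characters '<>'.
    """
    fields = tabular_line.rstrip("\n").split("\t")
    if len(fields) < 25:
        raise ValueError(
            "expected at least 25 columns, found %d" % len(fields)
        )
    return fields[24].split("<>")
```

Lines from the standard 12-column output raise ``ValueError`` here, which makes it obvious when a file was produced without the extended columns.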
123 Bug Reports 142 Bug Reports
124 =========== 143 ===========
138 https://github.com/peterjc/galaxy_blast 157 https://github.com/peterjc/galaxy_blast
139 158
140 For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball I use 159 For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball I use
141 the following command from the GitHub repository root folder:: 160 the following command from the GitHub repository root folder::
142 161
143 $ ./ncbi_blast_plus/make_ncbi_blast_plus.sh 162 $ tools/ncbi_blast_plus/make_ncbi_blast_plus.sh
144 163
145 This simplifies ensuring a consistent set of files is bundled each time, 164 This simplifies ensuring a consistent set of files is bundled each time,
146 including all the relevant test files. 165 including all the relevant test files.
166
167 When updating the version of BLAST+, many of the sample data files used for
168 the unit tests must be regenerated. This script automates that task::
169
170 $ tools/ncbi_blast_plus/update_test_files.sh
147 171
148 172
149 Licence (MIT) 173 Licence (MIT)
150 ============= 174 =============
151 175