comparison tools/seq_select_by_id/README.rst @ 4:6842c0c7bc70 draft

Uploaded v0.0.7, depend on Biopython 1.62, tabs to spaces in XML
author peterjc
date Mon, 28 Oct 2013 05:21:45 -0400
parents
children 1a83f5ab9e95
3:19e26966ed3e 4:6842c0c7bc70
Galaxy tool to select FASTA, QUAL, FASTQ or SFF sequences by ID
===============================================================

This tool is copyright 2011-2013 by Peter Cock, The James Hutton Institute
(formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
See the licence text below.

This tool is a short Python script (using Biopython library functions) to extract
sequences from a FASTA, QUAL, FASTQ, or SFF file based on the list of IDs given
by a column of a tabular file. The output order follows that of the tabular file,
and if there are duplicates in the tabular file, there will be duplicates in the
output sequence file.

This tool is available from the Galaxy Tool Shed at:

* http://toolshed.g2.bx.psu.edu/view/peterjc/seq_select_by_id

See also the sister tools to filter sequence files according to IDs from column(s)
of a tabular file (where the output order follows the sequence file, and any
duplicate IDs are ignored) and rename sequences:

* http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
* http://toolshed.g2.bx.psu.edu/view/peterjc/seq_rename

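The difference between selecting (this tool) and filtering (the sister tool)
can be illustrated with a minimal sketch. This is not the actual script: it
uses plain ``(id, sequence)`` tuples instead of Biopython records, and the
function names ``select_by_id`` and ``filter_by_id`` are hypothetical. How the
real tool reports an ID that is missing from the sequence file is not stated
in this README, so the sketch simply raises ``KeyError`` in that case.

```python
def select_by_id(records, wanted_ids):
    """Return records in the order of wanted_ids, repeating duplicates."""
    by_id = {rec_id: seq for rec_id, seq in records}
    # A missing ID raises KeyError here (an assumption of this sketch).
    return [(rec_id, by_id[rec_id]) for rec_id in wanted_ids]


def filter_by_id(records, wanted_ids):
    """Return matching records in sequence-file order, ignoring duplicate IDs."""
    wanted = set(wanted_ids)
    return [(rec_id, seq) for rec_id, seq in records if rec_id in wanted]


# Toy sequence file and toy ID column (made-up data for illustration).
records = [("seqA", "ACGT"), ("seqB", "GGCC"), ("seqC", "TTAA")]
ids = ["seqC", "seqA", "seqC"]

selected = select_by_id(records, ids)
filtered = filter_by_id(records, ids)
print([rec_id for rec_id, _ in selected])  # ['seqC', 'seqA', 'seqC']
print([rec_id for rec_id, _ in filtered])  # ['seqA', 'seqC']
```

Selecting follows the tabular file, so ``seqC`` appears twice and comes first;
filtering follows the sequence file, so each match appears once.
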

Automated Installation
======================

This should be straightforward using the Galaxy Tool Shed, which should be
able to automatically install the dependency on Biopython, and then install
this tool and run its unit tests.


Manual Installation
===================

There are just two files to install to use this tool from within Galaxy:

* seq_select_by_id.py (the Python script)
* seq_select_by_id.xml (the Galaxy tool definition)

The suggested location is a dedicated tools/seq_select_by_id folder.

You will also need to modify the tool_conf.xml file to tell Galaxy to offer the
tool. One suggested location is in the filters section. Simply add the line::

    <tool file="seq_select_by_id/seq_select_by_id.xml" />

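Editing the file by hand is the simplest route. For scripted setups, the same
change can be sketched with the standard library's ``xml.etree.ElementTree``.
The helper name ``add_tool``, the section id ``filter`` and the sample toolbox
content below are assumptions for illustration; check the section ids in your
own Galaxy configuration. Note that ElementTree discards XML comments and
original formatting by default, so treat this as a sketch rather than a
drop-in tool for rewriting a real configuration file.

```python
import xml.etree.ElementTree as ET

TOOL_FILE = "seq_select_by_id/seq_select_by_id.xml"


def add_tool(toolbox_xml, section_id, tool_file=TOOL_FILE):
    """Return toolbox XML text with a <tool> entry added to one section."""
    root = ET.fromstring(toolbox_xml)
    for section in root.iter("section"):
        if section.get("id") == section_id:
            # Skip the insert if the tool is already listed in this section.
            if not any(t.get("file") == tool_file for t in section.iter("tool")):
                ET.SubElement(section, "tool", {"file": tool_file})
            return ET.tostring(root, encoding="unicode")
    raise ValueError("no section with id %r" % section_id)


# Hypothetical, heavily trimmed toolbox file used only for this example.
example = """<toolbox>
  <section id="filter" name="Filter and Sort">
    <tool file="filters/sorter.xml" />
  </section>
</toolbox>"""

updated = add_tool(example, "filter")
print(updated)
```
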
If you wish to run the unit tests, also add the same line to tool_conf.xml.sample
and move/copy the test-data files under Galaxy's test-data folder. Then::

    $ ./run_functional_tests.sh -id seq_select_by_id

You will also need to install Biopython 1.54 or later. That's it.

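One way to confirm the Biopython requirement is met is to compare the installed
version against 1.54 numerically, since a plain string comparison can misorder
version numbers. The sketch below is an illustration only: the helper names
are hypothetical, and it assumes the version string starts with
``major.minor`` (for example ``1.62`` or ``1.62b``). Very old Biopython
releases may not define ``Bio.__version__`` at all, which the sketch treats as
"unknown".

```python
import re

MINIMUM = (1, 54)  # minimum Biopython version stated in this README


def version_tuple(version):
    """Turn a version string such as '1.62' or '1.62b' into (1, 62)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def new_enough(version, minimum=MINIMUM):
    """True if the version string is at least the required minimum."""
    return version_tuple(version) >= minimum


def installed_biopython_version():
    """Return Biopython's version string, or None if it cannot be determined."""
    try:
        import Bio
    except ImportError:
        return None
    return getattr(Bio, "__version__", None)


found = installed_biopython_version()
if found is None:
    print("Biopython not found, or too old to report its version")
else:
    print("Biopython %s: %s" % (found, "OK" if new_enough(found) else "too old"))
```
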

History
=======

======= ======================================================================
Version Changes
------- ----------------------------------------------------------------------
v0.0.1  - Initial version.
v0.0.3  - Ignore blank lines in input.
v0.0.4  - Record script version when run from Galaxy.
        - Basic unit test included.
v0.0.5  - Check for errors using Python script's return code.
v0.0.6  - Link to Tool Shed added to help text and this documentation.
        - Automatic installation of Biopython dependency.
        - Use reStructuredText for this README file.
        - Adopt standard MIT License.
v0.0.7  - Updated citation information (Cock et al. 2013).
        - Fixed Biopython dependency setup.
        - Development moved to GitHub, https://github.com/peterjc/pico_galaxy
        - Renamed folder and adopted README.rst naming.
======= ======================================================================


Developers
==========

This script and related tools were initially developed on the following hg branch:
http://bitbucket.org/peterjc/galaxy-central/src/tools

Development has now moved to a dedicated GitHub repository:
https://github.com/peterjc/pico_galaxy/tree/master/tools

To make the tarball for the Galaxy Tool Shed (http://toolshed.g2.bx.psu.edu/),
use the following command from the Galaxy root folder::

    $ tar -czf seq_select_by_id.tar.gz tools/seq_select_by_id/README.rst tools/seq_select_by_id/seq_select_by_id.* tools/seq_select_by_id/repository_dependencies.xml test-data/k12_ten_proteins.fasta test-data/k12_hypothetical.fasta test-data/k12_hypothetical.tabular

Check this worked::

    $ tar -tzf seq_select_by_id.tar.gz
    tools/seq_select_by_id/README.rst
    tools/seq_select_by_id/seq_select_by_id.py
    tools/seq_select_by_id/seq_select_by_id.xml
    tools/seq_select_by_id/repository_dependencies.xml
    test-data/k12_ten_proteins.fasta
    test-data/k12_hypothetical.fasta
    test-data/k12_hypothetical.tabular

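The visual check above can also be automated. The following sketch uses
Python's standard ``tarfile`` module to report which of the expected files are
absent from a tarball. The helper name ``missing_members`` is hypothetical,
and the demonstration builds a throwaway archive of empty files rather than
reading the real ``seq_select_by_id.tar.gz``.

```python
import io
import os
import tarfile
import tempfile

# File list copied from the expected "tar -tzf" output above.
EXPECTED = [
    "tools/seq_select_by_id/README.rst",
    "tools/seq_select_by_id/seq_select_by_id.py",
    "tools/seq_select_by_id/seq_select_by_id.xml",
    "tools/seq_select_by_id/repository_dependencies.xml",
    "test-data/k12_ten_proteins.fasta",
    "test-data/k12_hypothetical.fasta",
    "test-data/k12_hypothetical.tabular",
]


def missing_members(tarball_path, expected=EXPECTED):
    """Return the expected names that are absent from the gzipped tarball."""
    with tarfile.open(tarball_path, "r:gz") as archive:
        present = set(archive.getnames())
    return [name for name in expected if name not in present]


# Demonstration: a temporary tarball that deliberately omits the last file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.tar.gz")
    with tarfile.open(demo_path, "w:gz") as archive:
        for name in EXPECTED[:-1]:
            archive.addfile(tarfile.TarInfo(name), io.BytesIO(b""))
    demo_missing = missing_members(demo_path)

print(demo_missing)  # ['test-data/k12_hypothetical.tabular']
```

Against the real tarball, ``missing_members("seq_select_by_id.tar.gz")`` should
return an empty list when every expected file was packaged.
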

Licence (MIT)
=============

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.