..
   Source: Mercurial repository peterjc/sample_seqs,
   tools/sample_seqs/README.rst @ 6:31f5701cd2e9 (draft)
   Commit: v0.2.4 Depends on Biopython 1.67 via legacy Tool Shed package or bioconda.
   Author: peterjc
   Date: Thu, 11 May 2017 07:24:38 -0400
   Parents: 6b71ad5d43fb
   Children: 86710edcec02

Galaxy tool to sub-sample sequence files
========================================

This tool is copyright 2014-2017 by Peter Cock, The James Hutton Institute
(formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
See the licence text below (MIT licence).

This tool is a short Python script (using Biopython library functions)
to sub-sample sequence files (in a range of formats including FASTA, FASTQ,
and SFF). This can be useful for preparing a small sample of data to test
or time a new pipeline, or for reducing the read coverage in a de novo
assembly.
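
As a rough illustration of the idea, the sketch below keeps every N-th record
of a FASTA file while preserving the input order. It is not the tool's actual
code: the real script relies on Biopython, whereas this sketch uses a small
plain-Python FASTA reader so that it runs without dependencies. The function
names ``read_fasta`` and ``sample_every_nth`` are hypothetical, and "every
N-th record" is an assumption about what sub-sampling means here.

```python
import io


def read_fasta(handle):
    """Yield (title, sequence) tuples from a FASTA-formatted text handle."""
    title, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header line closes the previous record, if any.
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line.strip())
    if title is not None:
        yield title, "".join(chunks)


def sample_every_nth(records, n):
    """Yield the 1st, (n+1)-th, (2n+1)-th, ... record, in input order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    for index, record in enumerate(records):
        if index % n == 0:
            yield record


# Five toy records; record "b" spans two sequence lines.
example = io.StringIO(">a\nACGT\n>b\nGGCC\nTTAA\n>c\nAAAA\n>d\nCCCC\n>e\nTTTT\n")
subset = list(sample_every_nth(read_fasta(example), 2))
for title, seq in subset:
    print(">%s\n%s" % (title, seq))
```

Because both functions are generators, the input is streamed one record at a
time, which matters for large read files.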

This tool is available from the Galaxy Tool Shed at:

* http://toolshed.g2.bx.psu.edu/view/peterjc/sample_seqs


Automated Installation
======================

This should be straightforward using the Galaxy Tool Shed, which should be
able to automatically install the dependency on Biopython, and then install
this tool and run its unit tests.


Manual Installation
===================

There are just two files to install to use this tool from within Galaxy:

* ``sample_seqs.py`` (the Python script)
* ``sample_seqs.xml`` (the Galaxy tool definition)

The suggested location is in a dedicated ``tools/sample_seqs`` folder.

You will also need to modify the ``tool_conf.xml`` file to tell Galaxy to offer the
tool. One suggested location is in the filters section. Simply add the line::

    <tool file="sample_seqs/sample_seqs.xml" />

You will also need to install Biopython 1.62 or later.

If you wish to run the unit tests, also move/copy the ``test-data/`` files
under Galaxy's ``test-data/`` folder. Then::

    ./run_tests.sh -id sample_seqs

That's it.


History
=======

======= ======================================================================
Version Changes
------- ----------------------------------------------------------------------
v0.0.1  - Initial version.
v0.1.1  - Using ``optparse`` to provide a proper Python command line API.
v0.1.2  - Interleaved mode for working with paired records.
        - Tool definition now embeds citation information.
v0.2.0  - Option to give number of sequences (or pairs) desired.
          This works by first counting all your sequences, then calculating
          the percentage required in order to sample them uniformly (evenly).
          This makes two passes through the input and is therefore slower.
v0.2.1  - Was missing a file for the functional tests.
        - Included testing of stdout messages.
        - Included testing of failure modes.
v0.2.2  - Reorder XML elements (internal change only).
        - Use ``format_source=...`` tag.
        - Planemo for Tool Shed upload (``.shed.yml``, internal change only).
v0.2.3  - Do the Biopython imports at the script start (internal change only).
        - Clarify paired read example in help text.
v0.2.4  - Depends on Biopython 1.67 via legacy Tool Shed package or bioconda.
        - Style changes to Python code (internal change only).
======= ======================================================================
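
The two-pass approach described for v0.2.0 can be sketched as follows: a first
pass counts the records, and a second pass keeps records at an even rate of
``wanted / total``. This is only a minimal sketch of one way to do it, not the
script's actual code. The names ``count_records``, ``sample_count`` and
``reads`` are hypothetical, and the integer arithmetic used to pick records is
an assumption chosen so that exactly the requested number is returned.

```python
def count_records(make_iterator):
    """First pass: count the records without holding them in memory."""
    return sum(1 for _ in make_iterator())


def sample_count(make_iterator, wanted):
    """Second pass: yield exactly `wanted` records, spread evenly."""
    total = count_records(make_iterator)
    if wanted < 0 or wanted > total:
        raise ValueError("Requested %i records but input has %i" % (wanted, total))
    for index, record in enumerate(make_iterator()):
        # Keep a record each time the running quota (index * wanted / total)
        # crosses an integer boundary. Integer division avoids rounding drift.
        if (index + 1) * wanted // total > index * wanted // total:
            yield record


def reads():
    """Hypothetical input: a fresh iterator over ten toy records per call."""
    return iter(["read%02i" % i for i in range(10)])


picked = list(sample_count(reads, 4))
print(picked)
```

The input is supplied as a callable that returns a fresh iterator, because the
data must be read twice; this is the reason the option is slower than the
single-pass modes.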


Developers
==========

This script and related tools are being developed on this GitHub repository:
https://github.com/peterjc/pico_galaxy/tree/master/tools/sample_seqs

For pushing a release to the test or main "Galaxy Tool Shed", use the following
Planemo commands (which require that you have set your Tool Shed access details
in ``~/.planemo.yml`` and that you have access rights on the Tool Shed)::

    $ planemo shed_update -t testtoolshed --check_diff tools/sample_seqs/
    ...

or::

    $ planemo shed_update -t toolshed --check_diff tools/sample_seqs/
    ...

To just build and check the tar ball, use::

    $ planemo shed_upload --tar_only tools/sample_seqs/
    ...
    $ tar -tzf shed_upload.tar.gz
    test-data/MID4_GLZRM4E04_rnd30_frclip.pair_sample_N5.sff
    test-data/MID4_GLZRM4E04_rnd30_frclip.sff
    test-data/MID4_GLZRM4E04_rnd30_frclip.sample_C1.sff
    test-data/MID4_GLZRM4E04_rnd30_frclip.sample_N5.sff
    test-data/MID4_GLZRM4E04_rnd30_frclip.pair_sample_N5.sff
    test-data/ecoli.fastq
    test-data/ecoli.pair_sample_N100.fastq
    test-data/ecoli.sample_C10.fastq
    test-data/ecoli.sample_N100.fastq
    test-data/get_orf_input.Suis_ORF.prot.fasta
    test-data/get_orf_input.Suis_ORF.prot.pair_sample_C10.fasta
    test-data/get_orf_input.Suis_ORF.prot.pair_sample_N100.fasta
    test-data/get_orf_input.Suis_ORF.prot.sample_C10.fasta
    test-data/get_orf_input.Suis_ORF.prot.sample_N100.fasta
    tools/sample_seqs/README.rst
    tools/sample_seqs/sample_seqs.py
    tools/sample_seqs/sample_seqs.xml
    tools/sample_seqs/tool_dependencies.xml
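
If you would rather check the tar ball from a script than read the listing by
eye, something like the following might do, using only the Python standard
library. It is a sketch under assumptions: the helper names
``missing_from_tarball`` and ``build_demo_tarball`` are hypothetical, and the
demo builds a small stand-in archive instead of calling Planemo.

```python
import io
import os
import tarfile
import tempfile


def missing_from_tarball(tar_path, expected):
    """Return the expected member names that are absent from a .tar.gz file."""
    with tarfile.open(tar_path, "r:gz") as archive:
        present = set(archive.getnames())
    return sorted(set(expected) - present)


def build_demo_tarball(tar_path, names):
    """Write a stand-in tarball holding empty files with the given names."""
    with tarfile.open(tar_path, "w:gz") as archive:
        for name in names:
            archive.addfile(tarfile.TarInfo(name), io.BytesIO(b""))


with tempfile.TemporaryDirectory() as folder:
    demo = os.path.join(folder, "shed_upload.tar.gz")
    # Deliberately leave out the XML file so that the check reports it.
    build_demo_tarball(
        demo,
        ["tools/sample_seqs/README.rst", "tools/sample_seqs/sample_seqs.py"],
    )
    missing = missing_from_tarball(
        demo,
        [
            "tools/sample_seqs/README.rst",
            "tools/sample_seqs/sample_seqs.py",
            "tools/sample_seqs/sample_seqs.xml",
        ],
    )
print(missing)
```

With a real ``shed_upload.tar.gz``, only ``missing_from_tarball`` would be
needed, and an empty list would mean that every expected file is present.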


Licence (MIT)
=============

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.