seq_format-converter
====================

A script to convert a sequence file to another format.

## Synopsis

    perl seq_format-converter.pl -i seq_file.gbk -f gbk -o embl

## Description

This script converts a (multi-)sequence file of a specific format to a differently formatted output file. The most common sequence formats are: **embl**, **fasta**, and **gbk** (genbank).

Since sequence formats change from time to time, BioPerl is not always up to date. For all available BioPerl sequence formats see: http://www.bioperl.org/wiki/HOWTO:SeqIO#Formats. **Warning**: The *bioperl-ext* package and the *io_lib* library from the **Staden** package (http://staden.sourceforge.net/) need to be installed in order to read the scf, abi, alf, pln, exp, ctf, ztr formats.

## Usage

    perl seq_format-converter.pl -i seq_file -f in_format -o out_format

### UNIX loop to reformat all sequence files in the current working directory

Choose one alternative from each bracketed list; the brackets mark placeholders and are not shell syntax.

    for i in *.[embl|gbk]; do perl seq_format-converter.pl -i "$i" -f [embl|gbk] -o [embl|fasta|gbk]; done
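
As a more concrete variant of the loop above, the following sketch wraps it in a small shell function. The function name `convert_all` and the `DRY_RUN` variable are hypothetical and not part of the script; only the `-i`, `-f`, and `-o` options are taken from this README. With `DRY_RUN=1` the commands are printed instead of executed, which can help to check the file selection first.

```shell
#!/bin/sh
# Hypothetical helper (not part of seq_format-converter.pl):
# convert every file with the given extension in the current directory.
#   $1 = input format and file extension (e.g. gbk)
#   $2 = output format (e.g. embl)
convert_all() {
    in_format=$1
    out_format=$2
    for f in *."$in_format"; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Only show what would be run.
            echo "perl seq_format-converter.pl -i $f -f $in_format -o $out_format"
        else
            perl seq_format-converter.pl -i "$f" -f "$in_format" -o "$out_format"
        fi
    done
}
```

For example, `DRY_RUN=1 convert_all gbk embl` lists the conversions for all `*.gbk` files, and `convert_all gbk embl` runs them. The sketch assumes that the file extension equals the input format name, as in the synopsis.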

## Options for *seq_format-converter.pl*

### Mandatory options

* -i, -input

Input sequence file

* -f, -format

Input sequence format (e.g. 'embl' or 'gbk')

* -o, -out_format

Output sequence format (e.g. 'embl', 'fasta' or 'gbk')

### Optional options

* -h, -help

Print usage

* -v, -version

Print version number

## Output

* seq_file.[embl|fasta|gbk]

Output sequence file in the specified format

## Run environment

The Perl script runs under Windows and UNIX flavors.

## Dependencies (not in the core Perl modules)

* BioPerl (tested with version 1.006901)

## Author/contact

Andreas Leimbach (aleimba[at]gmx[dot]de; Microbial Genome Plasticity, Institute of Hygiene, University of Muenster)

## Citation, installation, and license

For [citation](https://github.com/aleimba/bac-genomics-scripts#citation), [installation](https://github.com/aleimba/bac-genomics-scripts#installation-recommendations), and [license](https://github.com/aleimba/bac-genomics-scripts#license) information please see the repository main [*README.md*](https://github.com/aleimba/bac-genomics-scripts/blob/master/README.md).

## Changelog

* v0.2 (03.02.2014)
    - allow short 'gbk' format instead of 'genbank'
    - also short 'gbk' file-extension for output file
    - included 'use autodie'
    - usage as HERE document
    - options with Getopt::Long
    - version switch
* v0.1 (10.11.2011)