revcom_seq
==========

`revcom_seq.pl` is a script to reverse complement (multi-)sequence files.

* [Synopsis](#synopsis)
* [Description](#description)
* [Usage](#usage)
* [Options](#options)
* [Output](#output)
* [Run environment](#run-environment)
* [Dependencies](#dependencies)
* [Author - contact](#author---contact)
* [Citation, installation, and license](#citation-installation-and-license)
* [Changelog](#changelog)

## Synopsis

    perl revcom_seq.pl seq-file.embl > seq-file_revcom.embl

**or**

    perl cat_seq.pl multi-seq_file.embl | perl revcom_seq.pl -i embl > seq_file_cat_revcom.embl

## Description

This script reverse complements (multi-)sequence files. The
features/annotations in RichSeq files (e.g. EMBL or GenBank format)
are adapted accordingly. Use option **-o** to specify a
different output sequence format. Input can be given as a file or
via *STDIN*. If *STDIN* is used, the input sequence file
format has to be given with option **-i**. Be careful to set the
correct input format.
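
As an illustration of the operation itself, the following shell sketch reverse complements a plain nucleotide string and maps a feature's 1-based coordinates onto the reverse complemented sequence. The helper names `revcomp` and `revcomp_coords` are hypothetical and not part of `revcom_seq.pl`, which relies on BioPerl for sequences and annotations. The sketch assumes the `rev` utility is available and translates only unambiguous bases.

```shell
# Hypothetical helper: reverse the string, then complement each base.
# Only A, C, G, and T (either case) are translated; other characters pass through.
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Hypothetical helper: map a 1-based feature (start, end) on a sequence of
# the given length to its coordinates after reverse complementing.
revcomp_coords() {
    start=$1 end=$2 length=$3
    echo "$((length - end + 1)) $((length - start + 1))"
}

revcomp ATGCCG            # prints CGGCAT
revcomp_coords 3 10 100   # prints 91 98
```

A feature that spanned positions 3 to 10 on the forward strand of a 100 bp sequence therefore ends up at positions 91 to 98, on the opposite strand.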

## Usage

    perl revcom_seq.pl -o gbk seq-file.embl > seq-file_revcom.gbk

**or** reverse complement all EMBL files in the current working directory:

    for file in *.embl; do perl revcom_seq.pl -o fasta "$file" > "${file%.embl}"_revcom.fasta; done
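
The `${file%.embl}` parameter expansion in the loop strips the `.embl` suffix from each file name before `_revcom.fasta` is appended. A minimal sketch of that step, using the hypothetical file name `genome.embl`:

```shell
file="genome.embl"                    # hypothetical input file name
out="${file%.embl}_revcom.fasta"      # strip ".embl", append the new suffix
echo "$out"                           # prints genome_revcom.fasta
```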

## Options

- **-h**, **-help**

    Help (perldoc POD)

- **-o**=*str*, **-outformat**=*str*

    Specify a different sequence format for the output [fasta, embl, or gbk]

- **-i**=*str*, **-informat**=*str*

    Specify the input sequence file format; only needed for *STDIN* input

- **-v**, **-version**

    Print version number to *STDOUT*

## Output

- *STDOUT*

    The reverse complemented sequence file is printed to *STDOUT*.
    Redirect or pipe into another tool as needed.

## Run environment

The Perl script runs under Windows and UNIX flavors.

## Dependencies

- [**BioPerl**](http://www.bioperl.org) (tested version 1.007001)

## Author - contact

Andreas Leimbach (aleimba[at]gmx[dot]de; Microbial Genome Plasticity, Institute of Hygiene, University of Muenster)

## Citation, installation, and license

For [citation](https://github.com/aleimba/bac-genomics-scripts#citation), [installation](https://github.com/aleimba/bac-genomics-scripts#installation-recommendations), and [license](https://github.com/aleimba/bac-genomics-scripts#license) information please see the repository main [*README.md*](https://github.com/aleimba/bac-genomics-scripts/blob/master/README.md).

## Changelog

* v0.2 (2015-12-10)
    * included a POD instead of a simple usage text
    * included `pod2usage` with Pod::Usage
    * included 'use autodie' pragma
    * options with Getopt::Long
    * output format now specified with option **-o**
    * included version switch, **-v**
    * allowed file and *STDIN* input, instead of only file; thus new option **-i** for input format
    * output printed to *STDOUT* now, instead of output file
    * fixed a bug where only the first sequence in a multi-sequence file was reverse complemented; now all sequences in a multi-sequence file are reverse complemented
* v0.1 (2013-02-08)