annotate COG/bac-genomics-scripts/revcom_seq/README.md @ 3:e42d30da7a74 draft

Uploaded
author dereeper
date Thu, 30 May 2024 11:52:25 +0000
revcom_seq
==========

`revcom_seq.pl` is a script to reverse complement (multi-)sequence files.

* [Synopsis](#synopsis)
* [Description](#description)
* [Usage](#usage)
* [Options](#options)
* [Output](#output)
* [Run environment](#run-environment)
* [Dependencies](#dependencies)
* [Author - contact](#author---contact)
* [Citation, installation, and license](#citation-installation-and-license)
* [Changelog](#changelog)

## Synopsis

    perl revcom_seq.pl seq-file.embl > seq-file_revcom.embl

**or**

    perl cat_seq.pl multi-seq_file.embl | perl revcom_seq.pl -i embl > seq_file_cat_revcom.embl

## Description

This script reverse complements (multi-)sequence files. The
features/annotations in RichSeq files (e.g. EMBL or GENBANK format)
are adapted accordingly. Use option **-o** to specify a
different output sequence format. Input can be given either via
*STDIN* or as a file. If *STDIN* is used, the input sequence file
format has to be given with option **-i**. Be careful to set the
correct input format.
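
To make "adapted accordingly" concrete, here is a minimal shell sketch of what reverse complementing means for a plain sequence and for a feature's coordinates. It is not how `revcom_seq.pl` works internally (the script relies on BioPerl); the helper names `revcomp` and `flip_feature` are hypothetical, coordinates are assumed to be 1-based and inclusive as in EMBL/GenBank feature tables, and the `rev` utility is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: reverse complement a plain DNA string.
# Only A/C/G/T (either case) are complemented; other characters such as N
# are reversed but left as-is. IUPAC ambiguity codes are not handled.
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Hypothetical helper: map a feature onto the reverse complemented sequence.
# Arguments: sequence length, start, end (1-based, inclusive), strand (+ or -).
flip_feature() {
    len=$1; start=$2; end=$3; strand=$4
    new_start=$(( len - end + 1 ))
    new_end=$(( len - start + 1 ))
    if [ "$strand" = "+" ]; then new_strand="-"; else new_strand="+"; fi
    printf '%s %s %s\n' "$new_start" "$new_end" "$new_strand"
}

revcomp AAGC                # prints GCTT
flip_feature 100 11 20 +    # prints 81 90 -
```

A feature at positions 11..20 on the forward strand of a 100 bp sequence therefore ends up at 81..90 on the complementary strand, and applying the mapping twice returns the original coordinates.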

## Usage

    perl revcom_seq.pl -o gbk seq-file.embl > seq-file_revcom.gbk

**or** reverse complement all EMBL files (`*.embl`) in the current working directory:

    for file in *.embl; do perl revcom_seq.pl -o fasta "$file" > "${file%.embl}"_revcom.fasta; done
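
The loop builds each output name with the shell's `${file%.embl}` expansion, which strips a trailing `.embl` from the input name. The sketch below isolates that step as a dry run: the helper `outname` and the file names are hypothetical, and `echo` stands in for the actual `revcom_seq.pl` call, so nothing is read or written.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name the loop above would use.
# ${1%.embl} removes a trailing ".embl"; names without it are left unchanged.
outname() {
    printf '%s\n' "${1%.embl}_revcom.fasta"
}

# Dry run over hypothetical file names: print the commands instead of running them.
for file in genome.embl plasmid_p1.embl; do
    echo "perl revcom_seq.pl -o fasta $file > $(outname "$file")"
done
```

If the directory contains no `*.embl` file, most shells leave the pattern unexpanded, so a guard such as `[ -e "$file" ] || continue` at the start of the loop body may be worth adding.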

## Options

- **-h**, **-help**

    Help (perldoc POD)

- **-o**=*str*, **-outformat**=*str*

    Specify different sequence format for the output [fasta, embl, or gbk]

- **-i**=*str*, **-informat**=*str*

    Specify the input sequence file format, only needed for *STDIN* input

- **-v**, **-version**

    Print version number to *STDOUT*

## Output

- *STDOUT*

    The reverse complemented sequence file is printed to *STDOUT*.
    Redirect or pipe into another tool as needed.

## Run environment

The Perl script runs under Windows and UNIX flavors.

## Dependencies

- [**BioPerl**](http://www.bioperl.org) (tested version 1.007001)

## Author - contact

Andreas Leimbach (aleimba[at]gmx[dot]de; Microbial Genome Plasticity, Institute of Hygiene, University of Muenster)

## Citation, installation, and license

For [citation](https://github.com/aleimba/bac-genomics-scripts#citation), [installation](https://github.com/aleimba/bac-genomics-scripts#installation-recommendations), and [license](https://github.com/aleimba/bac-genomics-scripts#license) information please see the repository main [*README.md*](https://github.com/aleimba/bac-genomics-scripts/blob/master/README.md).

## Changelog

* v0.2 (2015-12-10)
    * included a POD instead of a simple usage text
    * included `pod2usage` with Pod::Usage
    * included 'use autodie' pragma
    * options with Getopt::Long
    * output format now specified with option **-o**
    * included version switch, **-v**
    * allowed file and *STDIN* input, instead of only file; thus new option **-i** for input format
    * output printed to *STDOUT* now, instead of output file
    * fixed a bug where only the first sequence in a multi-sequence file was reverse complemented; now all sequences in a multi-seq file are reverse complemented
* v0.1 (2013-02-08)