comparison fasta_affixer.xml @ 0:a4cd8608ef6b draft

Uploaded
author petr-novak
date Mon, 01 Apr 2019 07:56:36 -0400
parents
children c2c69c6090f0
-1:000000000000 0:a4cd8608ef6b
1 <tool id="fasta_affixer" name="FASTA read name affixer" version="1.0.0">
2 <description> Adds a prefix and a suffix to sequence names </description>
3 <command interpreter="python3">
4 fasta_affixer.py -f $input -p "$prefix" -s "$suffix" -n $nspace -o $output
5 </command>
6
7 <inputs>
8 <param format="fasta" type="data" name="input" label="Choose your fasta file" />
9 <param name="prefix" type="text" size="10" value="" label="Prefix" help="Prefix to add to every sequence name" />
10 <param name="suffix" type="text" size="10" value="" label="Suffix" help="Suffix to add to every sequence name"/>
11 <param name="nspace" type="integer" size="10" value="0" min="0" max="1000" label="Number of spaces in name to ignore" help="By default, the sequence name is the text before the first space. To include spaces in the name, enter the number of spaces to keep as a positive integer. All characters after the ignored spaces are omitted."/>
12 </inputs>
13
14
15 <outputs>
16 <data format="fasta" name="output" label="fasta dataset ${input.hid} with modified sequence names" />
17 </outputs>
18
19 <tests>
20 <test>
21 <param name="input" value="single_output.fasta" />
22 <param name="prefix" value="TEST" />
23 <param name="suffix" value="OK"/>
24 <param name="nspace" value="0" />
25 <output name="output" value="prefix_suffix.fasta" />
26 </test>
27 </tests>
28 <help>
29 **What it does**
30
31 Adds a prefix and a suffix to sequence names in FASTA-formatted files. This tool is useful
32 if you want to do comparative analysis with RepeatExplorer and need to
33 append sample codes to sequence identifiers.
34
35 **Example**
36 The following fasta file:
37
38 ::
39
40 >123454
41 acgtactgactagccatgacg
42 >234235
43 acgtactgactagccatgacg
44
45 is renamed to:
46
47 ::
48
49 >prefix123454suffix
50 acgtactgactagccatgacg
51 >prefix234235suffix
52 acgtactgactagccatgacg
53
54
55 By default, everything after the first space is
56 excluded from the sequence name. For the example sequence:
57
58 ::
59
60 >SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1
61 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC
62 IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIHIIIIIFIIIIIIHDHBBIHFIHIIBHHDDHIFHIHIIIHIHGGDFDEI@EGEGFGFEFB@ECG
63
64 When **Number of spaces in name to ignore** is set to 0 (default), the output will be:
65
66 ::
67
68 >prefixSRR352150.23846180suffix
69 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC
70
71
72 To keep the text after the first space, set **Number of spaces in name to ignore** to 1, which yields:
73
74 ::
75
76 >prefixSRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1suffix
77 CTGGATTCTATACCTTTGGCAACTACTTCTTGGTTGATCAGGAAATTAACACTAGTAGTTTAGGCAATTTGGAATGGTGCCAAAGATGTATAGAACTTTC
78
79
80 </help>
81 </tool>
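
The renaming rule described in the help text can be sketched in Python as follows. This is only an illustration of the documented behaviour: the source of `fasta_affixer.py` is not shown in this changeset, so the function names `affix_header` and `affix_fasta` and their signatures are hypothetical, and the real script may differ in details such as whitespace handling.

```python
def affix_header(header, prefix="", suffix="", nspace=0):
    """Return a FASTA header with prefix and suffix added to the name.

    Hypothetical helper: the name is taken as the first nspace + 1
    space-separated fields of the header; everything after is dropped.
    """
    text = header.rstrip("\n")
    if text.startswith(">"):
        text = text[1:]
    # Keep the first nspace + 1 fields, rejoined with single spaces.
    name = " ".join(text.split(" ")[: nspace + 1])
    return ">" + prefix + name + suffix


def affix_fasta(lines, prefix="", suffix="", nspace=0):
    """Yield the lines of a FASTA file with every header renamed."""
    for line in lines:
        if line.startswith(">"):
            yield affix_header(line, prefix, suffix, nspace)
        else:
            # Sequence lines pass through unchanged (minus the newline).
            yield line.rstrip("\n")


if __name__ == "__main__":
    record = [
        ">SRR352150.23846180 HWUSI-EAS1786:7:119:15910:19280/1\n",
        "CTGGATTCTATACC\n",
    ]
    for out in affix_fasta(record, prefix="prefix", suffix="suffix", nspace=0):
        print(out)
```

With `nspace=0` the header is cut at the first space before the affixes are added; with `nspace=1` the text up to the second space (here, the whole header) is kept, matching the two examples in the help section.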