comparison removeFastaSubSequence.xml @ 0:9ec27561593e draft

planemo upload
author pravs
date Wed, 02 Aug 2017 18:09:53 -0400
parents
children d49328dfeceb
-1:000000000000 0:9ec27561593e
1
2 <tool id="removeFastaSubSequence" name="Remove Fasta Substring Sequence" version="1.0.0">
3 <description>Removes query sequences that occur as a substring of a sequence in a reference FASTA file.</description>
4 <requirements>
5 <requirement type="package" version="1.70">biopython</requirement>
6 </requirements>
7 <command><![CDATA[python '$__tool_directory__/removeFastaSubSequence.py' '$ref_fastafile' '$query_fastafile' '$output']]></command>
8 <inputs>
9 <param name="ref_fastafile" type="data" format="fasta">
10 <label>Input Reference Fasta File</label>
11 </param>
12 <param name="query_fastafile" type="data" format="fasta">
13 <label>Input Query Fasta File</label>
14 </param>
15 </inputs>
16
17 <outputs>
18 <data format="fasta" name="output" label="uniqSeq_${query_fastafile.name.rsplit('.',1)[0]}.fasta" />
19 </outputs>
20
21 <tests>
22 <test>
23 <param name="ref_fastafile" value="test_ref.fasta" />
24 <param name="query_fastafile" value="test_query.fasta" />
25 <output name="output" file="uniqSeq_test_query.fasta">
26 <assert_contents>
27 <has_text text="ENSMUST00000193003" />
28 </assert_contents>
29 </output>
30 </test>
31 </tests>
32
33
34 <help>
35 This tool removes from the query FASTA file every sequence that occurs as a substring of a sequence in the reference FASTA file.
36
37 EXAMPLE:
38
39 ----
40
41 Ref sequences:
42
43 >reference_seq_1
44
45 TSLDKDHLELCCTLSLPFSWACSWVLVLRLSINGQLPRSRLWAAHCLWGVP
46
47 >reference_seq_2
48
49 RGLCISGLEKEVQVQSRQAEGPVHLWLRKGSTSAE
50
51 ----
52
53 Query Sequences:
54
55 >query_seq_1
56
57 TKTILNYAVLSPCLSPGHVLGC
58
59
60 >query_seq_2
61
62 LDKDHLELCCTLSLPFSWACSWVLVL
63
64
65 >query_seq_3
66
67 LWGVPRGLCISG
68
69 ----
70
71 Output Sequences:
72
73 >query_seq_1
74
75 TKTILNYAVLSPCLSPGHVLGC
76
77
78 >query_seq_3
79
80 LWGVPRGLCISG
81
82 ----
83
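84 The output file contains only query_seq_1 and query_seq_3. query_seq_2 is removed because its sequence "LDKDHLELCCTLSLPFSWACSWVLVL" is
85 present as a substring of reference_seq_1's sequence "TSLDKDHLELCCTLSLPFSWACSWVLVLRLSINGQLPRSRLWAAHCLWGVP". query_seq_3 is kept: its sequence matches the end of reference_seq_1 followed by the start of reference_seq_2, but it is not contained in either reference sequence on its own.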
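
The substring filter described above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the tool's actual removeFastaSubSequence.py, which is not shown here and, per the requirements, depends on Biopython. The function names parse_fasta and remove_subsequences are hypothetical, and the sketch assumes an exact, case-sensitive match against each reference sequence individually.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def remove_subsequences(ref_records, query_records):
    """Keep query records whose sequence is not a substring of any single reference sequence."""
    ref_seqs = [seq for _, seq in ref_records]
    return [
        (name, seq)
        for name, seq in query_records
        if not any(seq in ref for ref in ref_seqs)
    ]


# Hypothetical in-memory inputs, copied from the example above.
REF = """>reference_seq_1
TSLDKDHLELCCTLSLPFSWACSWVLVLRLSINGQLPRSRLWAAHCLWGVP
>reference_seq_2
RGLCISGLEKEVQVQSRQAEGPVHLWLRKGSTSAE
"""

QUERY = """>query_seq_1
TKTILNYAVLSPCLSPGHVLGC
>query_seq_2
LDKDHLELCCTLSLPFSWACSWVLVL
>query_seq_3
LWGVPRGLCISG
"""

kept = remove_subsequences(parse_fasta(REF), parse_fasta(QUERY))
for name, seq in kept:
    print(">" + name)
    print(seq)
```

Each query is checked against every reference sequence separately, so a query that only matches across the boundary between two reference records is not removed.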
86
87 </help>
88 </tool>