comparison removeFastaSubSequence.xml @ 0:9ec27561593e draft
planemo upload
author | pravs |
---|---|
date | Wed, 02 Aug 2017 18:09:53 -0400 |
parents | |
children | d49328dfeceb |

<tool id="removeFastaSubSequence" name="Remove Fasta Substring Sequence" version="1.0.0">
    <description>Removes query sequences that are substrings of sequences in a reference FASTA file.</description>
    <requirements>
        <requirement type="package" version="1.70">biopython</requirement>
    </requirements>
    <command interpreter="python"><![CDATA[removeFastaSubSequence.py $ref_fastafile $query_fastafile $output]]></command>
    <inputs>
        <param name="ref_fastafile" type="data" format="fasta">
            <label>Input Reference Fasta File</label>
        </param>
        <param name="query_fastafile" type="data" format="fasta">
            <label>Input Query Fasta File</label>
        </param>
    </inputs>

    <outputs>
        <data format="fasta" name="output" label="uniqSeq_${query_fastafile.name.rsplit('.',1)[0]}.fasta" />
    </outputs>

    <tests>
        <test>
            <param name="ref_fastafile" value="test_ref.fasta" />
            <param name="query_fastafile" value="test_query.fasta" />
            <output name="output" file="uniqSeq_test_query.fasta">
                <assert_contents>
                    <has_text text="ENSMUST00000193003" />
                </assert_contents>
            </output>
        </test>
    </tests>


    <help>
This program removes from the query FASTA file every sequence that occurs as a substring of a sequence in the reference FASTA file.

EXAMPLE:

----

Reference Sequences:

>reference_seq_1

TSLDKDHLELCCTLSLPFSWACSWVLVLRLSINGQLPRSRLWAAHCLWGVP

>reference_seq_2

RGLCISGLEKEVQVQSRQAEGPVHLWLRKGSTSAE

----

Query Sequences:

>query_seq_1

TKTILNYAVLSPCLSPGHVLGC


>query_seq_2

LDKDHLELCCTLSLPFSWACSWVLVL


>query_seq_3

LWGVPRGLCISG

----

Output Sequences:

>query_seq_1

TKTILNYAVLSPCLSPGHVLGC


>query_seq_3

LWGVPRGLCISG

----

The output file contains only query_seq_1 and query_seq_3. query_seq_2 is removed because its sequence "LDKDHLELCCTLSLPFSWACSWVLVL" occurs as a substring of reference_seq_1's sequence "TSLDKDHLELCCTLSLPFSWACSWVLVLRLSINGQLPRSRLWAAHCLWGVP". query_seq_3 is kept: "LWGVPRGLCISG" spans the end of reference_seq_1 and the start of reference_seq_2, but it is not contained in either reference sequence on its own.

    </help>
</tool>
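
The filtering rule described in the help text can be sketched in Python as follows. This is a minimal illustration only: the wrapped script removeFastaSubSequence.py is not shown on this page, so the function names `read_fasta` and `remove_subsequences` are hypothetical, and the sketch uses a small plain-text FASTA parser instead of the Biopython package that the tool declares as a requirement. It assumes that each reference sequence is checked separately and that matching is exact and case-sensitive.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines carry no data
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def remove_subsequences(ref_records, query_records):
    """Keep query records whose sequence is not a substring of any single reference sequence."""
    ref_seqs = [seq for _, seq in ref_records]
    return [
        (name, seq)
        for name, seq in query_records
        if not any(seq in ref for ref in ref_seqs)
    ]


# Example data copied from the help text above.
REF = """>reference_seq_1
TSLDKDHLELCCTLSLPFSWACSWVLVLRLSINGQLPRSRLWAAHCLWGVP
>reference_seq_2
RGLCISGLEKEVQVQSRQAEGPVHLWLRKGSTSAE
"""

QUERY = """>query_seq_1
TKTILNYAVLSPCLSPGHVLGC
>query_seq_2
LDKDHLELCCTLSLPFSWACSWVLVL
>query_seq_3
LWGVPRGLCISG
"""

kept = remove_subsequences(read_fasta(REF), read_fasta(QUERY))
for name, seq in kept:
    print(">" + name)
    print(seq)
```

Run on the example data, the sketch keeps query_seq_1 and query_seq_3 and drops query_seq_2, matching the output listed in the help text. Each query is compared against every reference sequence, so the cost grows with the product of the two file sizes; that is acceptable for an illustration but may not reflect how the actual script is implemented.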