comparison Scan_IUPAC_output_matches_per_seq.xml @ 0:b67ea47730d3 draft

Version 1.0.1.
author pjbriggs
date Wed, 21 Mar 2018 06:39:41 -0400
1 <?xml version="1.0" encoding="utf-8"?>
2 <tool id="fasta_scan_iupac_per_seq" name="IUPAC scan and output matches per seq" version="@VERSION@">
3 <description>Counts the matches to a given IUPAC</description>
4 <macros>
5 <import>motif_tools_macros.xml</import>
6 </macros>
7 <expand macro="requirements" />
8 <command><![CDATA[
9 perl $__tool_directory__/Scan_IUPAC_output_matches_per_seq.pl $iupac $fasta $output $strand
10 ]]></command>
11 <inputs>
12 <param name="iupac" type="text" label="IUPAC string" value="e.g. WGATAR" help="Enter an IUPAC string." size="20"/>
13 <param format="fasta" name="fasta" type="data" label="FASTA file" help="Select a FASTA file containing the sequences to be scanned."/>
14 <param name="strand" type="select" label="Select sequence strands to scan" help="Scan either both strands or only the forward strand.">
15 <option value="0">Scan both strands</option>
16 <option value="1">Only scan forward strand</option>
17 </param>
18 </inputs>
19 <outputs>
20 <data format="tabular" name="output" />
21 </outputs>
22 <tests>
23 <test>
24 <param name="iupac" value="WGATAR" />
25 <param name="fasta" value="phix.fa" />
26 <param name="strand" value="0" />
27 <output name="output" file="iupac_matches_per_seq.out" />
28 </test>
29 </tests>
30
31 <help>
32 .. class:: infomark
33
34 **What it does**
35
36 This tool finds all matches to a DNA pattern, given as an IUPAC string, in each input DNA sequence. Matches are non-overlapping, so searching for 'TTTT' in 'TTTTTTTT' finds two hits. The output is a table giving each sequence name and the number of matches to the IUPAC string in that sequence. This version is useful if you want a count of IUPAC matches per sequence (e.g. per binding region) that you can paste back into a spreadsheet.
37
38 IUPAC = Nucleotide(s):
39
40 A = A
41
42 C = C
43
44 G = G
45
46 T = T
47
48 M = A/C
49
50 R = A/G
51
52 W = A/T
53
54 S = C/G
55
56 Y = C/T
57
58 K = G/T
59
60 V = A/C/G
61
62 H = A/C/T
63
64 D = A/G/T
65
66 B = C/G/T
67
68 N = A/C/G/T
69
70 ----
71
72 .. class:: infomark
73
74 **Options**
75
76 'IUPAC string' - can be entered in upper or lower case, as the tool converts it to upper case, but only the IUPAC codes listed above are accepted.
77
78 'Select sequence strands to scan' - Scanning only the forward strand of the input sequence is useful if the IUPAC is a palindrome (e.g. CANNTG), because a palindromic pattern matches the same sites on both strands.
79
80 ----
81
82 .. class:: infomark
83
84 **Credits**
85
86 This Galaxy tool has been developed within the Bioinformatics Core Facility at the University of Manchester. It runs the Scan_IUPAC_output_matches_per_seq.pl Perl script that was written by Ian Donaldson.
87
88 Please kindly acknowledge both this Galaxy tool and Scan_IUPAC_output_matches_per_seq.pl if you use them.
89 </help>
90
91 </tool>
92
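
The help text above describes non-overlapping IUPAC matching and the choice between scanning both strands or only the forward strand. The Perl script itself is not shown in this changeset, so the following Python sketch is only an illustration of that description, not the script's actual implementation. The function names (`iupac_to_regex`, `reverse_complement`, `count_matches`) are hypothetical, and the way both-strand scanning is modelled here (adding the matches of the reverse-complemented pattern on the forward sequence) is an assumption.

```python
import re

# IUPAC code -> nucleotides, copied from the table in the tool's help text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

# Complement of each IUPAC code (e.g. R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTMRWSYKVHDBN", "TGCAKYWSRMBDHVN")


def iupac_to_regex(iupac):
    """Compile an IUPAC string into a regex of character classes."""
    iupac = iupac.upper()  # accept lower-case input, as the help text describes
    unknown = sorted(set(iupac) - set(IUPAC))
    if unknown:
        raise ValueError("not IUPAC codes: " + "".join(unknown))
    return re.compile("".join("[" + IUPAC[code] + "]" for code in iupac))


def reverse_complement(iupac):
    """Reverse-complement an IUPAC string."""
    return iupac.upper().translate(COMPLEMENT)[::-1]


def count_matches(iupac, sequence, both_strands=True):
    """Count non-overlapping matches of an IUPAC string in one sequence.

    Assumption: both-strand scanning is modelled by also counting matches
    of the reverse-complemented pattern on the forward sequence.
    """
    sequence = sequence.upper()
    # findall() scans left to right and returns non-overlapping matches.
    total = len(iupac_to_regex(iupac).findall(sequence))
    if both_strands:
        total += len(iupac_to_regex(reverse_complement(iupac)).findall(sequence))
    return total
```

Under these assumptions, `count_matches("TTTT", "TTTTTTTT", both_strands=False)` gives 2, matching the example in the help text, while a palindromic pattern such as CANNTG is counted once per site on the forward strand and twice when both strands are scanned.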