annotate SMART/galaxy/CompareOverlappingSmallRef.xml @ 64:783e6ed4eb66

Minor bug correction. Added casts to str in Galaxy XML files. Also closed the writer in the Python script "changeTagName."
author m-zytnicki
date Mon, 19 Oct 2015 14:16:44 +0200
parents 90f4b29d884f
children
rev   line source
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
1 <tool id="CompareOverlappingSmallRef" name="compare overlapping small reference">
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
2 <description>Provide the queries that overlap with a reference, when the query data set is small.</description>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
3 <requirements>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
4 <requirement type="set_environment">PYTHONPATH</requirement>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
5 </requirements>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
6 <command interpreter="python">
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
7 ../Java/Python/CompareOverlappingSmallRef.py -i $formatType.inputFileName1 -f $formatType.FormatInputFileName1 -j $formatType2.inputFileName2 -g $formatType2.FormatInputFileName2 -o $outputFileGff $InvertMatch $NotOverlapping $OptionInclusionQuery $OptionInclusionRef -m $OptionMinOverlap
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
8 #if $OptionDistance.Dist == 'Yes':
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
9 -d $OptionDistance.distance
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
10 #end if
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
11 #if $OptionPcOverlapQuery.present == 'Yes':
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
12 -p $OptionPcOverlapQuery.minOverlap
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
13 #end if
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
14 #if $OptionPcOverlapRef.present == 'Yes':
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
15 -P $OptionPcOverlapRef.minOverlap
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
16 #end if
64
783e6ed4eb66 Minor bug correction.
m-zytnicki
parents: 60
diff changeset
17 #if str($OptionCA) == 'Collinear':
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
18 -c
64
783e6ed4eb66 Minor bug correction.
m-zytnicki
parents: 60
diff changeset
19 #elif str($OptionCA) == 'AntiSense':
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
20 -a
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
21 #end if
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
22 </command>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
23
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
24 <inputs>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
25 <conditional name="formatType">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
26 <param name="FormatInputFileName1" type="select" label="Input Query File Format">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
27 <option value="bed">bed</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
28 <option value="gff">gff</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
29 <option value="gff2">gff2</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
30 <option value="gff3">gff3</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
31 <option value="sam">sam</option>
56
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
32 <option value="bam">bam</option>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
33 <option value="gtf">gtf</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
34 </param>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
35 <when value="bed">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
36 <param name="inputFileName1" format="bed" type="data" label="Input File 1"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
37 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
38 <when value="gff">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
39 <param name="inputFileName1" format="gff" type="data" label="Input File 1"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
40 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
41 <when value="gff2">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
42 <param name="inputFileName1" format="gff2" type="data" label="Input File 1"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
43 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
44 <when value="gff3">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
45 <param name="inputFileName1" format="gff3" type="data" label="Input File 1"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
46 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
47 <when value="sam">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
48 <param name="inputFileName1" format="sam" type="data" label="Input File 1"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
49 </when>
56
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
50 <when value="bam">
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
51 <param name="inputFileName1" format="bam" type="data" label="Input File 1"/>
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
52 </when>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
53 <when value="gtf">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
54 <param name="inputFileName1" format="gtf" type="data" label="Input File 1"/>
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
55 </when>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
56 </conditional>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
57
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
58 <conditional name="formatType2">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
59 <param name="FormatInputFileName2" type="select" label="Input Reference File Format">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
60 <option value="bed">bed</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
61 <option value="gff">gff</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
62 <option value="gff2">gff2</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
63 <option value="gff3">gff3</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
64 <option value="sam">sam</option>
56
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
65 <option value="bam">bam</option>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
66 <option value="gtf">gtf</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
67 </param>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
68 <when value="bed">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
69 <param name="inputFileName2" format="bed" type="data" label="Input File 2"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
70 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
71 <when value="gff">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
72 <param name="inputFileName2" format="gff" type="data" label="Input File 2"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
73 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
74 <when value="gff2">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
75 <param name="inputFileName2" format="gff2" type="data" label="Input File 2"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
76 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
77 <when value="gff3">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
78 <param name="inputFileName2" format="gff3" type="data" label="Input File 2"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
79 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
80 <when value="sam">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
81 <param name="inputFileName2" format="sam" type="data" label="Input File 2"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
82 </when>
56
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
83 <when value="bam">
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
84 <param name="inputFileName2" format="bam" type="data" label="Input File 2"/>
97aa2e42bfdf Uploaded
m-zytnicki
parents: 38
diff changeset
85 </when>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
86 <when value="gtf">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
87 <param name="inputFileName2" format="gtf" type="data" label="Input File 2"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
88 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
89 </conditional>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
90 <conditional name="OptionDistance">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
91 <param name="Dist" type="select" label="Maximum Distance between two reads">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
92 <option value="Yes">Yes</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
93 <option value="No" selected="true">No</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
94 </param>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
95 <when value="Yes">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
96 <param name="distance" type="integer" value="0"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
97 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
98 <when value="No">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
99 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
100 </conditional>
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
101 <param name="OptionMinOverlap" type="integer" value="1" label="Min. # of overlapping nt. to declare an overlap."/>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
102 <conditional name="OptionPcOverlapQuery">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
103 <param name="present" type="select" label="N% of the query must overlap">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
104 <option value="Yes">Yes</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
105 <option value="No" selected="true">No</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
106 </param>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
107 <when value="Yes">
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
108 <param name="minOverlap" type="integer" value="100"/>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
109 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
110 <when value="No">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
111 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
112 </conditional>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
113 <conditional name="OptionPcOverlapRef">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
114 <param name="present" type="select" label="N% of the reference must overlap">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
115 <option value="Yes">Yes</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
116 <option value="No" selected="true">No</option>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
117 </param>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
118 <when value="Yes">
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
119 <param name="minOverlap" type="integer" value="100"/>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
120 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
121 <when value="No">
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
122 </when>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
123 </conditional>
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
124 <param name="OptionInclusionQuery" type="boolean" truevalue="-k" falsevalue="" checked="false" label="The query must be nested in a reference"/>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
125 <param name="OptionInclusionRef" type="boolean" truevalue="-K" falsevalue="" checked="false" label="The reference must be nested in a query"/>
60
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
126 <param name="OptionCA" type="select" label="Collinear or anti-sense only">
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
127 <option value="Collinear">Collinear</option>
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
128 <option value="AntiSense">AntiSense</option>
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
129 <option value="All" selected="true">All</option>
90f4b29d884f Uploaded
m-zytnicki
parents: 56
diff changeset
130 </param>
38
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
131 <param name="InvertMatch" type="boolean" truevalue="-x" falsevalue="" checked="false" label="Invert match: the output file will contain all query elements which do NOT overlap"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
132 <param name="NotOverlapping" type="boolean" truevalue="-O" falsevalue="" checked="false" label="Also report the query data which do not overlap, with the nbOverlaps tag set to 0."/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
133 </inputs>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
134
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
135 <outputs>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
136 <data name="outputFileGff" format="gff3"/>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
137 </outputs>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
138
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
139 <help>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
140 This script may be the most important one. It compares two sets of transcripts and keeps those from the first set that overlap with the second. The first set is the query set (typically your own data) and the second is the reference set (RefSeq data, for example).
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
141
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
142 It is vital to understand that it will output the elements of the first file which overlap with the elements of the second one.
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
143
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
144 Various modifiers are also available:
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
145
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
146 -Invert selection (report those which do not overlap).
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
147
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
148 -Restrict to collinear / anti-sense overlapping data.
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
149
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
150 -Keep the query data even if they do not strictly overlap with the reference data, as long as they are located no more than *n* nucleotides away from some reference data.
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
151
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
152 -Keep the query data which are strictly included in reference data, meaning that a query transcript with at least 1 nucleotide that does not overlap with reference data will not be reported.
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
153
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
154 The mechanism of shrinking and extending is also useful for fine-grained comparisons. For example, if you want to keep the query data whose TSS overlaps the reference set, shrink the query set to 1 nucleotide. If you want to keep those which overlap your data or are located within 2kb downstream of it, extend the query data in the downstream direction. You can also extend in the opposite direction to find possible transcription factor sites located upstream.
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
155
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
156 The invert match option reverses the selection. In other words, it performs the comparison as usual, and outputs all the query data which do not overlap.
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
157 </help>
2c0c0a89fad7 Uploaded
m-zytnicki
parents:
diff changeset
158 </tool>
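
The selection rules described in the help text can be illustrated with a small Python sketch. This is not the code of CompareOverlappingSmallRef.py, which is not shown on this page. The names `Interval`, `overlap_length`, `gap`, `matches` and `select` are hypothetical, coordinates are assumed to be 1-based and inclusive (GFF convention), and "distance" is assumed to mean the number of nucleotides lying strictly between the two elements.

```python
from typing import Iterable, List, NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical stand-in for a transcript (not the tool's own class)."""
    chrom: str
    start: int   # 1-based, inclusive (assumed GFF convention)
    end: int     # inclusive
    strand: str  # "+" or "-"


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of shared nucleotides; 0 if disjoint or on different chromosomes."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def gap(a: Interval, b: Interval) -> Optional[int]:
    """Nucleotides strictly between a and b (0 if they overlap or touch).

    Returns None when the intervals are on different chromosomes.
    """
    if a.chrom != b.chrom:
        return None
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


def matches(q: Interval, r: Interval, min_overlap: int = 1,
            max_distance: Optional[int] = None, mode: str = "All",
            nested: bool = False) -> bool:
    """Assumed reading of the options: strand filter, nesting, overlap, distance."""
    if mode == "Collinear" and q.strand != r.strand:
        return False
    if mode == "AntiSense" and q.strand == r.strand:
        return False
    if nested:
        # The query must lie entirely inside the reference.
        return q.chrom == r.chrom and r.start <= q.start and q.end <= r.end
    if overlap_length(q, r) >= min_overlap:
        return True
    if max_distance is not None:
        d = gap(q, r)
        return d is not None and d <= max_distance
    return False


def select(queries: Iterable[Interval], refs: Iterable[Interval],
           invert: bool = False, **opts) -> List[Interval]:
    """Keep the queries that match at least one reference (or none, if invert)."""
    refs = list(refs)
    return [q for q in queries
            if any(matches(q, r, **opts) for r in refs) != invert]


refs = [Interval("chr1", 100, 200, "+")]
queries = [
    Interval("chr1", 150, 250, "+"),  # overlaps the reference by 51 nt
    Interval("chr1", 205, 300, "-"),  # 4 nt away, opposite strand
    Interval("chr2", 150, 250, "+"),  # different chromosome
]

print(select(queries, refs))                   # only the first query
print(select(queries, refs, invert=True))      # the second and third queries
print(select(queries, refs, max_distance=10))  # the first and second queries
```

In this sketch the output is always a subset of the query set, which mirrors the point stressed in the help text: the elements of the first file are reported, never those of the second.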