comparison tools/venn_list/venn_list.xml @ 4:991342eca214 draft

Uploaded v0.0.8a, declare Biopython dependency via Tool Shed
author peterjc
date Wed, 29 Apr 2015 11:00:41 -0400
parents
children 26e35d5133a1
3:6aae6bc0802d 4:991342eca214
1 <tool id="venn_list" name="Venn Diagram" version="0.0.8">
2 <description>from lists</description>
3 <requirements>
4 <requirement type="python-module">rpy</requirement>
5 <requirement type="package" version="1.65">biopython</requirement>
6 <requirement type="python-module">Bio</requirement>
7 </requirements>
8 <stdio>
9 <!-- Anything other than zero is an error -->
10 <exit_code range="1:" />
11 <exit_code range=":-1" />
12 </stdio>
13 <command interpreter="python">
14 venn_list.py
15 #if $universe.type_select=="implicit":
16 - -
17 #else:
18 "$main" $main.ext
19 #end if
20 "$main_lab"
21 #for $s in $sets:
22 "$s.set" $s.set.ext "$s.lab"
23 #end for
24 $PDF
25 </command>
26 <inputs>
27 <param name="main_lab" size="30" type="text" value="Venn Diagram" label="Plot title"/>
28 <conditional name="universe">
29 <param name="type_select" type="select" label="Implicit or explicit full ID list?">
30 <option value="explicit">Explicit</option>
31 <option value="implicit">Implicit (use union of sets below)</option>
32 </param>
33 <when value="explicit">
34 <param name="main" type="data" format="tabular,fasta,fastq,sff" label="Full dataset (with all identifiers)" help="Tabular file (uses column one), FASTA, FASTQ or SFF file."/>
35 </when>
36 <when value="implicit"/>
37 </conditional>
38 <repeat name="sets" min="1" max="3" title="Sets">
39 <param name="set" type="data" format="tabular,fasta,fastq,sff" label="Members of set" help="Tabular file (uses column one), FASTA, FASTQ or SFF file."/>
40 <param name="lab" size="30" type="text" value="Group" label="Caption for set"/>
41 </repeat>
42 </inputs>
43 <outputs>
44 <data format="pdf" name="PDF" />
45 </outputs>
46 <tests>
47 <!-- This test doesn't seem to work properly: it ends up with two sets,
48 both using the same FASTA file, the second with the default "Group" label. -->
49 <test>
50 <param name="type_select" value="explicit"/>
51 <param name="main" value="venn_list.tabular" ftype="tabular"/>
52 <param name="main_lab" value="Some Proteins"/>
53 <param name="set" value="rhodopsin_proteins.fasta"/>
54 <param name="lab" value="Rhodopsins"/>
55 <output name="PDF" file="magic.pdf" ftype="pdf" compare="contains" />
56 </test>
57 <!-- Can't use more than one repeat value in tests (yet)
58 <test>
59 <param name="type_select" value="explicit"/>
60 <param name="main" value="venn_list.tabular" ftype="tabular"/>
61 <param name="main_lab" value="Some Proteins"/>
62 <param name="count" value="3"/>
63 <param name="set" value="rhodopsin_proteins.fasta"/>
64 <param name="lab" value="Rhodopsins"/>
65 <param name="set" value="four_human_proteins.fasta"/>
66 <param name="lab" value="Human"/>
67 <param name="set" value="blastp_four_human_vs_rhodopsin.tabular"/>
68 <param name="lab" value="Human vs Rhodopsin BLAST"/>
69 <output name="PDF" file="magic.pdf" ftype="pdf" compare="contains" />
70 </test>
71 -->
72 </tests>
73 <help>
74
75 .. class:: infomark
76
77 **TIP:** If your data is in tabular files, the identifier is assumed to be in column one.
78
79 **What it does**
80
81 Draws a Venn Diagram for one, two or three sets (as a PDF file).
82
83 You must supply one, two or three sets of identifiers -- corresponding
84 to one, two or three circles on the Venn Diagram.
85
86 In general you should also give the full list of all the identifiers
87 explicitly. This is used to calculate the number of identifiers outside
88 the circles (and check the identifiers in the other files match up).
89 The full list can be omitted by implicitly taking the union of the
90 category sets. In this case, the count outside the categories (circles)
91 will always be zero.
92
93 The identifiers can be taken from the first column of a tabular file
94 (e.g. query names in BLAST tabular output, or signal peptide predictions
95 after filtering, etc.), or from a sequence file (FASTA, FASTQ, SFF).
96
97 For example, you may have a set of NGS reads (as a FASTA, FASTQ or SFF
98 file), and the results of several different read mappings (e.g. to
99 different references) as tabular files (filtered to have just the mapped
100 reads). You could then show the different mappings (and their overlaps)
101 as a Venn Diagram, and the outside count would be the unmapped reads.
102
103 **Citations**
104
105 The Venn Diagrams are drawn using Gordon Smyth's limma package from
106 R/Bioconductor, http://www.bioconductor.org/
107
108 The R library is called from Python via rpy, http://rpy.sourceforge.net/
109
110 If you use this Galaxy tool in work leading to a scientific publication please
111 cite:
112
113 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013).
114 Galaxy tools and workflows for sequence analysis with applications
115 in molecular plant pathology. PeerJ 1:e167
116 http://dx.doi.org/10.7717/peerj.167
117
118 This tool uses Biopython to read and write SFF files, so you may also wish to
119 cite the Biopython application note (and Galaxy too of course):
120
121 Cock et al 2009. Biopython: freely available Python tools for computational
122 molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.
123 http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878.
124
125 </help>
126 <citations>
127 <citation type="doi">10.7717/peerj.167</citation>
128 <citation type="doi">10.1093/bioinformatics/btp163</citation>
129 </citations>
130 </tool>
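
The help text above describes how the count outside the circles depends on whether the full identifier list is explicit or implicit. The following is a minimal Python sketch of that arithmetic, not the code in venn_list.py itself. The function name `venn_counts`, its arguments, and the choice to raise `ValueError` for identifiers missing from the full list are all assumptions made for illustration.

```python
def venn_counts(sets, universe=None):
    """Count identifiers in each Venn region (hypothetical helper).

    sets: dict mapping a caption to a set of identifiers (one to three entries).
    universe: the full set of identifiers, or None for the implicit case,
    where the union of the given sets is used instead.

    Returns a dict mapping a frozenset of captions to the number of
    identifiers that belong to exactly those sets. The empty frozenset
    holds the count outside all circles.
    """
    if universe is None:
        # Implicit case: every identifier is in at least one set.
        universe = set().union(*sets.values())
    else:
        universe = set(universe)
        # Assumed behaviour: treat identifiers missing from the full list as an error.
        for caption, members in sets.items():
            unknown = set(members) - universe
            if unknown:
                raise ValueError(
                    "%i identifiers in %r are not in the full list"
                    % (len(unknown), caption)
                )
    counts = {}
    for identifier in universe:
        # The region is defined by which sets contain this identifier.
        region = frozenset(c for c, m in sets.items() if identifier in m)
        counts[region] = counts.get(region, 0) + 1
    # Always report the outside count, even when it is zero.
    counts.setdefault(frozenset(), 0)
    return counts


full_list = {"a", "b", "c", "d", "e"}
groups = {"Rhodopsins": {"a", "b"}, "Human": {"b", "c"}}

explicit = venn_counts(groups, full_list)
implicit = venn_counts(groups)

print(explicit[frozenset()])  # 2 (identifiers d and e are outside both circles)
print(implicit[frozenset()])  # 0 (the implicit universe is the union of the sets)
```

With an explicit full list the outside count is whatever the sets leave uncovered. With the implicit union it is zero by construction, which matches the behaviour the help text describes.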