comparison TopHit_namefilter/TopHit_namefilter.xml @ 0:9f1fe290345e default tip
Migrated tool version 0.1.Alx from old tool shed archive to new tool shed repository
author | abossers |
---|---|
date | Tue, 07 Jun 2011 18:07:34 -0400 |
parents | |
children |
-1:000000000000 | 0:9f1fe290345e |
---|---|
1 <tool id="TopHit_namefilter" name="TopHit filter" version="0.1.Alx"> | |
2 <description>Simple filter to keep the first N occurrences of each identifier in a file</description> | |
3 <command interpreter="perl"> | |
4 TopHit_namefilter_galaxy.pl | |
5 $input | |
6 $column | |
7 "$splitter" | |
8 $hits | |
9 $output_file | |
10 <!-- 2>$logfile --> | |
11 </command> | |
12 <inputs> | |
13 <param name="input" type="data" format="tabular,txt" label="Input tabular or plain text file" /> | |
14 <param name="column" type="integer" size="4" value="1" label="Column number to use after the split!" /> | |
15 <param name="splitter" type="text" size="10" value="\t" label="Splitter character/code to use" help="See help below for advanced options and how to use {pipe}" > | |
16 <sanitizer> | |
17 <valid> | |
18 <add value="\"/> | |
19 <add value=">"/> | |
20 <add value="%"/> | |
21 <add value="|"/> | |
22 </valid> | |
23 </sanitizer> | |
24 </param> | |
25 <param name="hits" type="integer" size="4" value="1" label="Number of occurrences to keep" help="They will not be sorted!" /> | |
26 </inputs> | |
27 <outputs> | |
28 <data name="output_file" format="input" label="Filtered table/text" /> | |
29 </outputs> | |
30 <tests> | |
31 </tests> | |
32 <help> | |
33 **What it does** | |
34 | |
35 TopHit_namefilter is a simple filter that keeps only the top hit, i.e. the first N occurrence(s), of an identifier. | |
36 It is useful for keeping only the first N hits per query in BLAST output when multiple hits were returned (and you don't want to rerun the BLAST analysis). | |
37 | |
38 Please be aware that NO additional filtering or checking is done on, for instance, the E-values of BLAST hits. | |
39 Top hit = FIRST hit, not necessarily the best. If multiple hits are selected to be returned, | |
40 they will NOT be sorted (in the example below, with 2 hits kept, the second hit occurs further down in the input | |
41 and therefore also in the output file). | |
42 | |
43 **Comments/feedback** on the Perl script or Galaxy wrapper: alex.bossers@wur.nl | |
44 | |
45 ----- | |
46 | |
47 **Note!** Take care with splitters, especially if you want to use special characters that have a meaning in a Perl "split" | |
48 pattern. They need to be escaped by a leading \\. | |
49 | |
50 Examples of splitters (the split is used for filtering only; the output keeps the ORIGINAL unsplit line!): | |
51 | |
52 :: | |
53 | |
54 Splitter Meaning Example line to split Split result for filtering only! | |
55 -------- ------------------------------- ----------------------- -------------------------------- | |
56 \t Single tab Foo<tab>Bar<tab>here ---> Foo Bar here | |
57 \| Single pipe Foo<tab>Bar|here ---> Foo<tab>Bar here | |
58 - Single dash Foo-Bar ---> Foo Bar | |
59 -|\| Combined splits on dash OR pipe Foo-Bar|here ---> Foo Bar here | |
60 | |
61 | |
62 ----- | |
63 | |
64 **EXAMPLE** | |
65 | |
66 Parameters: Column = 1, **hits = 2** and splitter = \\t | |
67 | |
68 **Input** | |
69 | |
70 Any text/tabular file: | |
71 | |
72 :: | |
73 | |
74 Q3262-21 gi|71066702|gb|AE016828.2| tja..here something extra | |
75 Q3262-23 gi|71066702|gb|AE016828.2| okay | |
76 Q3262-24 gi|71066702|gb|AE016828.2| nothing there | |
77 Q3262-21 gi|71066702|gb|AE016828.2| enhier was zonder space :) | |
78 Q3262-26 gi|71066702|gb|AE016828.2| or still | |
79 Q3262-21 gi|71066702|gb|AE016828.2| | |
80 Q3262-21 gi|71066702|gb|AE016828.2| | |
81 Q3262-21 gi|71066702|gb|AE016828.2| | |
82 Q3262-21 gi|71066702|gb|AE016828.2| | |
83 Q3262-21 gi|145004|gb|M80806.1|COXTRANSPO | |
84 Q3262-21 gi|144996|gb|M20482.1|COXHSPAB | |
85 Q3262-21 gi|161761570|gb|CP000890.1| | |
86 Q3262-30 gi|161761570|gb|CP000890.1| | |
87 Q3262-21 gi|161761570|gb|CP000890.1| | |
88 Q3262-21 gi|161761570|gb|CP000890.1| | |
89 Q3262-21 gi|161761570|gb|CP000890.1| | |
90 | |
91 | |
92 **Output** | |
93 | |
94 :: | |
95 | |
96 Q3262-21 gi|71066702|gb|AE016828.2| tja..here something extra | |
97 Q3262-23 gi|71066702|gb|AE016828.2| okay | |
98 Q3262-21 gi|71066702|gb|AE016828.2| enhier was zonder space :) | |
99 Q3262-24 gi|71066702|gb|AE016828.2| nothing there | |
100 Q3262-26 gi|71066702|gb|AE016828.2| or still | |
101 Q3262-30 gi|161761570|gb|CP000890.1| | |
102 | |
103 ----- | |
104 | |
105 Please acknowledge our work if you find it useful! | |
106 | |
107 | | |
108 | |
109 | |
110 </help> | |
111 </tool> | |
112 |
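
The filtering described in the help text above can be sketched in a few lines. The wrapped Perl script (`TopHit_namefilter_galaxy.pl`) is not shown on this page, so the following is only a minimal Python illustration of the documented behaviour, not the author's implementation. The function name `top_hits` and its parameter names are hypothetical; it assumes the splitter is treated as a regular expression (as with Perl's `split`), that columns are 1-based, that input order is preserved, and that lines lacking the requested column are dropped.

```python
import re
from collections import defaultdict


def top_hits(lines, column=1, splitter=r"\t", hits=1):
    """Keep the first `hits` lines per identifier, preserving input order.

    Hypothetical sketch: the identifier is the `column`-th field (1-based)
    obtained by splitting each line on the `splitter` regular expression.
    """
    pattern = re.compile(splitter)
    seen = defaultdict(int)  # identifier -> number of lines seen so far
    kept = []
    for line in lines:
        fields = pattern.split(line.rstrip("\n"))
        if column > len(fields):
            # Assumption: a line without the requested column is skipped.
            continue
        key = fields[column - 1]
        seen[key] += 1
        if seen[key] <= hits:
            # The ORIGINAL unsplit line is kept; the split is for filtering only.
            kept.append(line)
    return kept


if __name__ == "__main__":
    rows = ["Q1\thit-a\n", "Q2\thit-b\n", "Q1\thit-c\n", "Q1\thit-d\n"]
    # With hits=2, the third Q1 line is dropped; nothing is re-sorted.
    for row in top_hits(rows, column=1, splitter=r"\t", hits=2):
        print(row, end="")
```

Note that no sorting or scoring takes place: a kept line is simply one of the first N lines seen for its identifier, which matches the warning in the help that the first hit is not necessarily the best one.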