diff TopHit_namefilter/TopHit_namefilter.xml @ 0:9f1fe290345e default tip
Migrated tool version 0.1.Alx from old tool shed archive to new tool shed repository
author | abossers |
---|---|
date | Tue, 07 Jun 2011 18:07:34 -0400 |
parents | |
children | |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/TopHit_namefilter/TopHit_namefilter.xml Tue Jun 07 18:07:34 2011 -0400
@@ -0,0 +1,112 @@
+<tool id="TopHit_namefilter" name="TopHit filter" version="0.1.Alx">
+  <description>Simple filter to keep N occurrences of lines in a file</description>
+  <command interpreter="perl">
+    TopHit_namefilter_galaxy.pl
+    $input
+    $column
+    "$splitter"
+    $hits
+    $output_file
+    <!-- 2>$logfile -->
+  </command>
+  <inputs>
+    <param name="input" type="data" format="tabular,txt" label="Input tabular or plain text file" />
+    <param name="column" type="integer" size="4" value="1" label="Column number to use after the split!" />
+    <param name="splitter" type="text" size="10" value="\t" label="Splitter character/code to use" help="See help below for advanced options and how to use {pipe}" >
+      <sanitizer>
+        <valid>
+          <add value="\"/>
+          <add value=">"/>
+          <add value="%"/>
+          <add value="|"/>
+        </valid>
+      </sanitizer>
+    </param>
+    <param name="hits" type="integer" size="4" value="1" label="Number of occurrences to keep" help="They will not be sorted!" />
+  </inputs>
+  <outputs>
+    <data name="output_file" format="input" label="Filtered table/text" />
+  </outputs>
+  <tests>
+  </tests>
+  <help>
+**What it does**
+
+TopHit_namefilter is a SIMPLE filter to keep just the TOPHIT / first [N] occurrence(s) of some identifier
+useful for keeping only the first N tophits in blast when multiple hits were returned (and you don't want to rerun the BLAST analysis).
+
+Please be aware that NO additional filtering or checking is done on for instance E values of BLAST hits.
+Tophit = FIRST hit...not necessarily the best.. If multiple hits are selected to be returned
+they will NOT be sorted (see below example of a number of 2 hits occurring somewhere else in the input
+and therefore in the output file).
+
+**Comments/feedback** on the Perl script or GALAXY wrapper: alex.bossers@wur.nl
+
+-----
+
+**Note!** Beware the special use of splitters! Especially if you want to use special characters that have a "perl" split
+meaning. They need to be escaped by a leading \\.
+
+Examples of splitters before filtering (end result will remain the ORIGINAL unsplit line!):
+
+::
+
+  Splitter Meaning                         Example line to split   Split result for filtering only!
+  -------- ------------------------------- ----------------------- --------------------------------
+  \t       Single tab                      Foo&lt;tab&gt;Bar&lt;tab&gt;here    ---> Foo Bar here
+  \|       Single pipe                     Foo&lt;tab&gt;Bar|here        ---> Foo&lt;tab&gt;Bar here
+  -        Single dash                     Foo-Bar                 ---> Foo Bar
+  -|\|     Combined splits on dash OR pipe Foo-Bar|here            ---> Foo Bar here
+
+
+-----
+
+**EXAMPLE**
+
+Parameters: Column = 1, **hits = 2** and splitter = \\t
+
+**Input**
+
+Any text/tabular file:
+
+::
+
+  Q3262-21 gi|71066702|gb|AE016828.2| tja..here something extra
+  Q3262-23 gi|71066702|gb|AE016828.2| okay
+  Q3262-24 gi|71066702|gb|AE016828.2| nothing there
+  Q3262-21 gi|71066702|gb|AE016828.2| enhier was zonder space :)
+  Q3262-26 gi|71066702|gb|AE016828.2| or still
+  Q3262-21 gi|71066702|gb|AE016828.2|
+  Q3262-21 gi|71066702|gb|AE016828.2|
+  Q3262-21 gi|71066702|gb|AE016828.2|
+  Q3262-21 gi|71066702|gb|AE016828.2|
+  Q3262-21 gi|145004|gb|M80806.1|COXTRANSPO
+  Q3262-21 gi|144996|gb|M20482.1|COXHSPAB
+  Q3262-21 gi|161761570|gb|CP000890.1|
+  Q3262-30 gi|161761570|gb|CP000890.1|
+  Q3262-21 gi|161761570|gb|CP000890.1|
+  Q3262-21 gi|161761570|gb|CP000890.1|
+  Q3262-21 gi|161761570|gb|CP000890.1|
+
+
+**Outputs**
+
+::
+
+  Q3262-21 gi|71066702|gb|AE016828.2| tja..here something extra
+  Q3262-23 gi|71066702|gb|AE016828.2| okay
+  Q3262-21 gi|71066702|gb|AE016828.2| enhier was zonder space :)
+  Q3262-24 gi|71066702|gb|AE016828.2| nothing there
+  Q3262-26 gi|71066702|gb|AE016828.2| or still
+  Q3262-30 gi|161761570|gb|CP000890.1|
+
+-----
+
+Please acknowledge our work when you find it useful!
+
+|
+
+
+  </help>
+</tool>
+
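
Reviewer note: the wrapped script `TopHit_namefilter_galaxy.pl` is not part of this changeset, so its exact behaviour cannot be checked here. The Python sketch below shows one way the filter described in the help text could work. It splits each line on the splitter (treated as a regular expression, as the help implies for Perl `split`), takes the chosen 1-based column as the identifier, and keeps the original unsplit line for the first N occurrences of that identifier. The function name `keep_top_hits`, the handling of lines that have too few columns, and the choice to emit kept lines in input order are all assumptions, not the author's implementation.

```python
import re
from collections import defaultdict


def keep_top_hits(lines, column=1, splitter=r"\t", hits=1):
    """Keep the first `hits` lines per identifier (hypothetical helper).

    The identifier is field number `column` (1-based) after splitting the
    line with the regular expression `splitter`. Kept lines are returned
    unsplit and in input order; nothing is sorted.
    """
    pattern = re.compile(splitter)
    seen = defaultdict(int)  # identifier -> number of occurrences so far
    kept = []
    for line in lines:
        fields = pattern.split(line.rstrip("\n"))
        # Assumption: a line with too few columns gets an empty identifier.
        key = fields[column - 1] if 0 < column <= len(fields) else ""
        seen[key] += 1
        if seen[key] <= hits:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Tab-separated sample loosely based on the help text's example.
    sample = [
        "Q3262-21\tgi|71066702|gb|AE016828.2|\ttja..here something extra",
        "Q3262-23\tgi|71066702|gb|AE016828.2|\tokay",
        "Q3262-24\tgi|71066702|gb|AE016828.2|\tnothing there",
        "Q3262-21\tgi|71066702|gb|AE016828.2|\tenhier was zonder space :)",
        "Q3262-21\tgi|161761570|gb|CP000890.1|",
        "Q3262-30\tgi|161761570|gb|CP000890.1|",
    ]
    for row in keep_top_hits(sample, column=1, splitter=r"\t", hits=2):
        print(row)
```

With `hits=2` the third `Q3262-21` line is dropped and every other line is kept. As in the help text, a pipe used as a splitter must be escaped (`r"\|"`), because an unescaped `|` is regex alternation.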