Mercurial > repos > peterjc > predictnls
view tools/protein_analysis/predictnls.xml @ 0:6e26c5a48e9a draft
Uploaded v0.0.4, first public release.
author | peterjc |
---|---|
date | Wed, 20 Feb 2013 11:39:06 -0500 |
parents | |
children |
line wrap: on
line source
<tool id="predictnls" name="PredictNLS" version="0.0.4"> <description>Find nuclear localization signals (NLSs) in protein sequences</description> <command interpreter="python"> predictnls.py $fasta_file $tabular_file </command> <inputs> <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/> </inputs> <outputs> <data name="tabular_file" format="tabular" label="predictNLS results" /> </outputs> <tests> <test> <param name="fasta_file" value="four_human_proteins.fasta"/> <output name="tabular_file" file="four_human_proteins.predictnls.tabular"/> </test> </tests> <requirements> <requirement type="binary">predictnls</requirement> </requirements> <help> **What it does** This calls a Python re-implementation of the PredictNLS tool for prediction of nuclear localization signals (NLSs), which works by looking for matches to a known set of patterns (described using regular expressions). The input is a FASTA file of protein sequences, and the output is tabular with these columns (multiple rows per protein): ====== ========================================================================== Column Description ------ -------------------------------------------------------------------------- 1 Sequence identifier 2 Start of NLS 3 NLS sequence 4 NLS pattern (regular expression) 5 Number of reference proteins with this NLS 6 Percentage of reference proteins with this NLS which are nuclear localized 7 Comma separated list of reference proteins 8 Comma separated list of reference proteins' localizations ====== ========================================================================== If a sequence has no predicted NLS, then there is no line in the output file for it. This is a simplification of the text rich output from the command line tool, to give a tabular file suitable for use within Galaxy. Information about potential DNA binding (shown in the original predictnls tool) is not given. **Localizations** The following abbreviations are used (derived from SWISS-PROT): ==== ======================= Abbr Localization ---- ----------------------- cyt Cytoplasm pla Chloroplast ret Eendoplasmic reticululm ext Extracellular gol Golgi lys Lysosomal mit Mitochondria nuc Nuclear oxi Peroxisom vac Vacuolar rip Periplasmic ==== ======================= **References** Murat Cokol, Rajesh Nair, and Burkhard Rost. Finding nuclear localization signals. EMBO reports 1(5), 411–415, 2000 http://dx.doi.org/10.1093/embo-reports/kvd092 http://rostlab.org </help> </tool>