
PredictNLS (version 0.0.8)

What it does

This tool calls a Python re-implementation of PredictNLS, which predicts nuclear localization signals (NLSs) by searching each protein sequence for matches to a known set of patterns described as regular expressions.
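As a rough illustration of regular-expression matching of this kind, the sketch below scans a sequence for pattern matches. It is not the tool's actual code: the pattern `K[KR].[KR]`, the example sequence, and the function name `find_matches` are hypothetical and not taken from the PredictNLS pattern set, and the 1-based start position is an assumption.

```python
import re

# Hypothetical pattern for illustration only; NOT from the PredictNLS pattern set.
EXAMPLE_PATTERNS = ["K[KR].[KR]"]


def find_matches(sequence, patterns):
    """Return (start, matched_text, pattern) tuples for each non-overlapping match.

    Start positions are reported 1-based here, which is an assumption.
    """
    hits = []
    for pattern in patterns:
        for match in re.finditer(pattern, sequence):
            hits.append((match.start() + 1, match.group(), pattern))
    return hits


# Made-up example sequence.
for start, nls, pattern in find_matches("MAPKKKRKVEDP", EXAMPLE_PATTERNS):
    print(start, nls, pattern, sep="\t")
```

Note that `re.finditer` reports non-overlapping matches only; whether the real tool also reports overlapping matches is not stated here.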

The input is a FASTA file of protein sequences, and the output is tabular with these columns (multiple rows per protein):

Column Description
1 Sequence identifier
2 Start of NLS
3 NLS sequence
4 NLS pattern (regular expression)
5 Number of reference proteins with this NLS
6 Percentage of reference proteins with this NLS which are nuclear-localized
7 Comma-separated list of reference proteins
8 Comma-separated list of the reference proteins' localizations

If a sequence has no predicted NLS, the output file contains no line for it. The tabular output is a simplification of the text-rich output of the command-line tool, designed to be suitable for use within Galaxy.
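A downstream script could read the eight columns along the following lines. This is a minimal sketch under stated assumptions: the file is tab-separated, any header line starts with `#`, and column 6 is a plain number. The names `NLSHit` and `parse_predictnls_table` and the sample row are invented for illustration.

```python
import csv
import io
from collections import namedtuple

# Field names are hypothetical labels for the eight documented columns.
NLSHit = namedtuple(
    "NLSHit",
    "seq_id start nls pattern ref_count pct_nuclear ref_proteins ref_localizations",
)


def parse_predictnls_table(handle):
    """Parse tabular output into NLSHit records (one per row)."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        # Assumption: blank lines and '#'-prefixed header lines are skipped.
        if not row or row[0].startswith("#"):
            continue
        hits.append(
            NLSHit(
                seq_id=row[0],
                start=int(row[1]),
                nls=row[2],
                pattern=row[3],
                ref_count=int(row[4]),
                pct_nuclear=float(row[5]),
                ref_proteins=row[6].split(","),
                ref_localizations=row[7].split(","),
            )
        )
    return hits


# Invented sample row, not real tool output.
SAMPLE = "seq1\t4\tKKKR\tK[KR].[KR]\t2\t100\tprotA,protB\tnuc,nuc\n"
for hit in parse_predictnls_table(io.StringIO(SAMPLE)):
    print(hit.seq_id, hit.start, hit.nls)
```

Because sequences without a predicted NLS produce no rows, any identifier missing from the parsed records should be treated as "no NLS predicted" rather than as an error.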

Information about potential DNA binding (shown in the original predictnls tool) is not given.

Localizations

The following abbreviations are used (derived from SWISS-PROT):

Abbr Localization
cyt Cytoplasm
pla Chloroplast
ret Endoplasmic reticulum
ext Extracellular
gol Golgi
lys Lysosomal
mit Mitochondria
nuc Nuclear
oxi Peroxisome
vac Vacuolar
rip Periplasmic
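To expand the abbreviations in column 8 into full names, a simple lookup table is enough. The sketch below is illustrative; the names `LOCALIZATIONS` and `expand_localizations` are hypothetical, and unknown abbreviations are passed through unchanged as a design choice.

```python
# Abbreviation table as listed above.
LOCALIZATIONS = {
    "cyt": "Cytoplasm",
    "pla": "Chloroplast",
    "ret": "Endoplasmic reticulum",
    "ext": "Extracellular",
    "gol": "Golgi",
    "lys": "Lysosomal",
    "mit": "Mitochondria",
    "nuc": "Nuclear",
    "oxi": "Peroxisome",
    "vac": "Vacuolar",
    "rip": "Periplasmic",
}


def expand_localizations(field):
    """Expand a comma-separated list of abbreviations into full names.

    Abbreviations not in the table are returned unchanged.
    """
    names = []
    for abbr in field.split(","):
        abbr = abbr.strip()
        names.append(LOCALIZATIONS.get(abbr, abbr))
    return names


print(expand_localizations("nuc,cyt"))
```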

References

If you use this Galaxy tool in work leading to a scientific publication, please cite the following papers:

Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013). Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology. PeerJ 1:e167 http://dx.doi.org/10.7717/peerj.167

Murat Cokol, Rajesh Nair, and Burkhard Rost (2000). Finding nuclear localization signals. EMBO reports 1(5), 411–415 http://dx.doi.org/10.1093/embo-reports/kvd092

See also http://rostlab.org

This wrapper can be installed into other Galaxy instances via the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/predictnls