comparison tools/protein_analysis/tmhmm2.xml @ 11:99b82a2b1272 draft
Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
author | peterjc |
---|---|
date | Wed, 03 Apr 2013 10:49:10 -0400 |
parents | e52220a9ddad |
children | dc958c2a963a |
10:09ff180d1615 | 11:99b82a2b1272 |
---|---|
1 <tool id="tmhmm2" name="TMHMM 2.0" version="0.0.9"> | 1 <tool id="tmhmm2" name="TMHMM 2.0" version="0.0.10"> |
2 <description>Find transmembrane domains in protein sequences</description> | 2 <description>Find transmembrane domains in protein sequences</description> |
3 <!-- If job splitting is enabled, break up the query file into parts --> | 3 <!-- If job splitting is enabled, break up the query file into parts --> |
4 <!-- Using 2000 chunks meaning 4 threads doing 500 each is ideal --> | 4 <!-- Using 2000 chunks meaning 4 threads doing 500 each is ideal --> |
5 <parallelism method="basic" split_inputs="fasta_file" split_mode="to_size" split_size="2000" merge_outputs="tabular_file"></parallelism> | 5 <parallelism method="basic" split_inputs="fasta_file" split_mode="to_size" split_size="2000" merge_outputs="tabular_file"></parallelism> |
6 <command interpreter="python"> | 6 <command interpreter="python"> |
45 | 45 |
46 This calls the TMHMM v2.0 tool for prediction of transmembrane (TM) helices in proteins using a hidden Markov model (HMM). | 46 This calls the TMHMM v2.0 tool for prediction of transmembrane (TM) helices in proteins using a hidden Markov model (HMM). |
47 | 47 |
48 The input is a FASTA file of protein sequences, and the output is tabular with six columns (one row per protein): | 48 The input is a FASTA file of protein sequences, and the output is tabular with six columns (one row per protein): |
49 | 49 |
50 1. Sequence identifier | 50 ====== ===================================================================================== |
51 2. Sequence length | 51 Column Description |
52 3. Expected number of amino acids in TM helices (ExpAA). If this number is larger than 18 it is very likely to be a transmembrane protein (OR have a signal peptide). | 52 ------ ------------------------------------------------------------------------------------- |
53 4. Expected number of amino acids in TM helices in the first 60 amino acids of the protein (Exp60). If this number more than a few, be aware that a predicted transmembrane helix in the N-term could be a signal peptide. | 53 1 Sequence identifier |
54 5. Number of transmembrane helices predicted by N-best. | 54 2 Sequence length |
55 6. Topology predicted by N-best (encoded as a strip using o for output and i for inside) | 55 3 Expected number of amino acids in TM helices (ExpAA). If this number is larger than |
56 18 it is very likely to be a transmembrane protein (OR have a signal peptide). | |
57 4 Expected number of amino acids in TM helices in the first 60 amino acids of the | |
58 protein (Exp60). If this number more than a few, be aware that a predicted | |
59 transmembrane helix in the N-term could be a signal peptide. | |
60 5 Number of transmembrane helices predicted by N-best. | |
61 6 Topology predicted by N-best (encoded as a strip using o for output and i for inside) | |
62 ====== ===================================================================================== | |
56 | 63 |
57 Predicted TM segments in the n-terminal region sometimes turn out to be signal peptides. | 64 Predicted TM segments in the n-terminal region sometimes turn out to be signal peptides. |
58 | 65 |
59 One of the most common mistakes by the program is to reverse the direction of proteins with one TM segment (i.e. mixing up which end of the protein is outside and inside the membrane). | 66 One of the most common mistakes by the program is to reverse the direction of proteins with one TM segment (i.e. mixing up which end of the protein is outside and inside the membrane). |
60 | 67 |
61 Do not use the program to predict whether a non-membrane protein is cytoplasmic or not. | 68 Do not use the program to predict whether a non-membrane protein is cytoplasmic or not. |
69 | |
62 | 70 |
63 **Notes** | 71 **Notes** |
64 | 72 |
65 The short format output from TMHMM v2.0 looks like this (six columns tab separated, shown here as a table): | 73 The short format output from TMHMM v2.0 looks like this (six columns tab separated, shown here as a table): |
66 | 74 |
79 gi|4959044|gb|AAD34209.1|AF069992_1 600 0.00 0.00 0 o | 87 gi|4959044|gb|AAD34209.1|AF069992_1 600 0.00 0.00 0 o |
80 gi|671626|emb|CAA85685.1| 473 0.19 0.00 0 o | 88 gi|671626|emb|CAA85685.1| 473 0.19 0.00 0 o |
81 gi|3298468|dbj|BAA31520.1| 107 59.37 31.17 3 o23-45i52-74o89-106i | 89 gi|3298468|dbj|BAA31520.1| 107 59.37 31.17 3 o23-45i52-74o89-106i |
82 =================================== === ===== ======= ======= ==================== | 90 =================================== === ===== ======= ======= ==================== |
83 | 91 |
92 | |
84 **References** | 93 **References** |
85 | 94 |
86 Krogh, Larsson, von Heijne, and Sonnhammer. | 95 Krogh, Larsson, von Heijne, and Sonnhammer. |
87 Predicting Transmembrane Protein Topology with a Hidden Markov Model: Application to Complete Genomes. | 96 Predicting Transmembrane Protein Topology with a Hidden Markov Model: Application to Complete Genomes. |
88 J. Mol. Biol. 305:567-580, 2001. | 97 J. Mol. Biol. 305:567-580, 2001. |
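
The six-column tabular output described in the help text above can be read with a few lines of Python. The following is a minimal sketch, not part of the wrapper itself: the names `TmhmmRow`, `parse_row`, `helix_spans` and `likely_membrane` are hypothetical, and it assumes each data row is tab separated exactly as in the sample rows shown in the help. The threshold of 18 for ExpAA is taken from the column description; in the topology string, `o` and `i` mark the outside and inside loops and each `start-end` pair is one predicted helix.

```python
import re
from typing import List, NamedTuple, Tuple


class TmhmmRow(NamedTuple):
    """One row of the six-column output (hypothetical container)."""
    identifier: str
    length: int
    exp_aa: float      # expected number of amino acids in TM helices
    exp_60: float      # same, restricted to the first 60 residues
    num_helices: int   # number of TM helices predicted by N-best
    topology: str      # e.g. "o23-45i52-74o89-106i"


def parse_row(line: str) -> TmhmmRow:
    """Parse one tab-separated data row into a TmhmmRow."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError("expected 6 columns, got %d" % len(fields))
    seq_id, length, exp_aa, exp_60, num_helices, topology = fields
    return TmhmmRow(seq_id, int(length), float(exp_aa),
                    float(exp_60), int(num_helices), topology)


def helix_spans(topology: str) -> List[Tuple[int, int]]:
    """Return the (start, end) positions of each helix in a topology string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)-(\d+)", topology)]


def likely_membrane(row: TmhmmRow, threshold: float = 18.0) -> bool:
    """Apply the ExpAA rule of thumb from the column description."""
    return row.exp_aa > threshold
```

For the last sample row in the help (`gi|3298468|dbj|BAA31520.1|`), `helix_spans` gives three spans, which matches the helix count in column 5. A protein that passes the ExpAA threshold may still have a signal peptide rather than a true TM helix, so column 4 (Exp60) is worth checking alongside it.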