comparison interpro/paso3.xml @ 0:c342ebb50f0b draft default tip
Uploaded
author | fernando |
---|---|
date | Thu, 22 May 2014 05:09:07 -0400 |
parents | |
children |
-1:000000000000 | 0:c342ebb50f0b |
---|---|
1 <tool id="CLaGiFer_3" name="Sequences attributes" version="1.0.0"> | |
2 <description>Download gff file from InterPro</description> | |
3 <command interpreter="bash"> | |
4 ./paso3.sh "$infile" "$outfile" | |
5 </command> | |
6 | |
7 <inputs> | |
8 <param name="infile" type="data" format="fasta" label="Fasta file"/> | |
9 </inputs> | |
10 <outputs> | |
11 <data format="gff" name="outfile"/> | |
12 </outputs> | |
13 | |
14 <stdio><exit_code range="1:" level="fatal" description="Error" /></stdio> | |
15 <help> | |
16 | |
17 | |
18 **What it does** | |
19 | |
20 InterProScan is a batch tool for querying the InterPro database. It provides annotations based on searches against multiple profile and other functional databases. | |
21 | |
22 | |
23 **Dependencies** | |
24 | |
25 The InterProScan package must be installed (http://code.google.com/p/interproscan/wiki/HowToDownload). | |
26 | |
27 | |
28 | |
29 ##### | |
30 Input | |
31 ##### | |
32 | |
33 A FASTA file containing protein sequences is required. | |
34 | |
35 | |
36 ###### | |
37 Output | |
38 ###### | |
39 | |
40 Generic Feature Format Version 3 (GFF3) | |
41 | |
42 The GFF3 format is a flat, tab-delimited file that is much richer than the TSV output format. It lets you trace back from matches to predicted proteins and to nucleic acid sequences. It also contains a FASTA representation of the predicted protein sequences and their matches. All columns and attributes are documented at [http://www.sequenceontology.org/gff3.shtml]. | |
43 | |
44 Example Output | |
45 -------------- | |
46 | |
47 :: | |
48 | |
49 ##gff-version 3 | |
50 ##feature-ontology http://song.cvs.sourceforge.net/viewvc/song/ontology/sofa.obo?revision=1.269 | |
51 ##sequence-region AACH01000027 1 1347 | |
52 ##seqid|source|type|start|end|score|strand|phase|attributes | |
53 AACH01000027 provided_by_user nucleic_acid 1 1347 . + . Name=AACH01000027;md5=b2a7416cb92565c004becb7510f46840;ID=AACH01000027 | |
54 AACH01000027 getorf ORF 1 1347 . + . Name=AACH01000027.2_21;Target=pep_AACH01000027_1_1347 1 449;md5=b2a7416cb92565c004becb7510f46840;ID=orf_AACH01000027_1_1347 | |
55 AACH01000027 getorf polypeptide 1 449 . + . md5=fd0743a673ac69fb6e5c67a48f264dd5;ID=pep_AACH01000027_1_1347 | |
56 AACH01000027 Pfam protein_match 84 314 1.2E-45 + . Name=PF00696;signature_desc=Amino acid kinase family;Target=null 84 314;status=T;ID=match$8_84_314;Ontology_term="GO:0008652";date=15-04-2013;Dbxref="InterPro:IPR001048","Reactome:REACT_13" | |
57 ##sequence-region 2 | |
58 ... | |
59 >pep_AACH01000027_1_1347 | |
60 LVLLAAFDCIDDTKLVKQIIISEIINSLPNIVNDKYGRKVLLYLLSPRDPAHTVREIIEV | |
61 LQKGDGNAHSKKDTEIRRREMKYKRIVFKVGTSSLTNEDGSLSRSKVKDITQQLAMLHEA | |
62 GHELILVSSGAIAAGFGALGFKKRPTKIADKQASAAVGQGLLLEEYTTNLLLRQIVSAQI | |
63 LLTQDDFVDKRRYKNAHQALSVLLNRGAIPIINENDSVVIDELKVGDNDTLSAQVAAMVQ | |
64 ADLLVFLTDVDGLYTGNPNSDPRAKRLERIETINREIIDMAGGAGSSNGTGGMLTKIKAA | |
65 TIATESGVPVYICSSLKSDSMIEAAEETEDGSYFVAQEKGLRTQKQWLAFYAQSQGSIWV | |
66 DKGAAEALSQYGKSLLLSGIVEAEGVFSYGDIVTVFDKESGKSLGKGRVQFGASALEDML | |
67 RSQKAKGVLIYRDDWISITPEIQLLFTEF | |
68 ... | |
69 >match$8_84_314 | |
70 KRIVFKVGTSSLTNEDGSLSRSKVKDITQQLAMLHEAGHELILVSSGAIAAGFGALGFKK | |
71 RPTKIADKQASAAVGQGLLLEEYTTNLLLRQIVSAQILLTQDDFVDKRRYKNAHQALSVL | |
72 LNRGAIPIINENDSVVIDELKVGDNDTLSAQVAAMVQADLLVFLTDVDGLYTGNPNSDPR | |
73 AKRLERIETINREIIDMAGGAGSSNGTGGMLTKIKAATIATESGVPVYICS | |
74 | |
75 | |
76 | |
77 ---------- | |
78 References | |
79 ---------- | |
80 | |
81 | |
82 If you use this Galaxy tool in work leading to a scientific publication, please | |
83 cite the following papers: | |
84 | |
85 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013). | |
86 Galaxy tools and workflows for sequence analysis with applications | |
87 in molecular plant pathology. PeerJ 1:e167 | |
88 http://dx.doi.org/10.7717/peerj.167 | |
89 | |
90 Zdobnov EM, Apweiler R (2001) | |
91 InterProScan an integration platform for the signature-recognition methods in InterPro. | |
92 Bioinformatics 17, 847-848. | |
93 http://dx.doi.org/10.1093/bioinformatics/17.9.847 | |
94 | |
95 Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R (2005) | |
96 InterProScan: protein domains identifier. | |
97 Nucleic Acids Research 33 (Web Server issue), W116-W120. | |
98 http://dx.doi.org/10.1093/nar/gki442 | |
99 | |
100 Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. (2009) | |
101 InterPro: the integrative protein signature database. | |
102 Nucleic Acids Research 37 (Database Issue), D224-228. | |
103 http://dx.doi.org/10.1093/nar/gkn785 | |
104 | |
105 | |
106 This wrapper is available to install into other Galaxy Instances via the Galaxy Tool Shed at | |
107 http://toolshed.g2.bx.psu.edu/view/bgruening/interproscan5 | |
108 | |
109 | |
110 **Galaxy Wrapper Author**:: | |
111 | |
112 * Fernando Pérez | |
113 * Ginés Almagro | |
114 * Laura Entrambasaguas | |
115 </help> | |
116 </tool> |
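
The help text above describes the tool's output as nine tab-delimited GFF3 columns (seqid, source, type, start, end, score, strand, phase, attributes) followed by FASTA records. As a rough illustration of how a downstream script might read that output, here is a minimal Python sketch. It is not part of the wrapper or of `paso3.sh`. The names `Gff3Feature` and `parse_gff3` are hypothetical, and the sketch assumes that the attributes column is a `;`-separated list of `key=value` pairs and that the FASTA section begins at the first line starting with `>` or `##FASTA`.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Gff3Feature(NamedTuple):
    """One feature row, using the column names from the example header line."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3(lines: Iterable[str]) -> Iterator[Gff3Feature]:
    """Yield feature rows from GFF3 text, stopping at the FASTA section."""
    for raw in lines:
        line = raw.rstrip("\n")
        # Assumption: everything from the first FASTA header onward is sequence data.
        if line.startswith("##FASTA") or line.startswith(">"):
            break
        # Skip blank lines, directives, and comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip anything that is not a nine-column feature row (e.g. a "..." line).
        if len(cols) != 9:
            continue
        attrs: Dict[str, str] = {}
        for pair in cols[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attrs[key] = value
        yield Gff3Feature(
            cols[0], cols[1], cols[2],
            int(cols[3]), int(cols[4]),
            cols[5], cols[6], cols[7],
            attrs,
        )
```

A caller could, for example, collect the Pfam matches with `[f for f in parse_gff3(open(path)) if f.source == "Pfam"]`, where `path` is the GFF3 file produced by the tool. Keeping the score, strand, and phase columns as strings avoids guessing how the `.` placeholder should be converted.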