view interpro/paso3.xml @ 0:c342ebb50f0b draft default tip
Uploaded
author | fernando |
---|---|
date | Thu, 22 May 2014 05:09:07 -0400 |
parents | |
children |
<tool id="CLaGiFer_3" name="Sequences attributes" version="1.0.0">
    <description>Download GFF file from InterPro</description>
    <command interpreter="bash">
        ./paso3.sh "$infile" "$outfile"
    </command>
    <inputs>
        <param name="infile" type="data" format="fasta" label="Fasta file"/>
    </inputs>
    <outputs>
        <data format="gff" name="outfile"/>
    </outputs>
    <stdio><exit_code range="1:" level="fatal" description="Error" /></stdio>
    <help>

**What it does**

InterProScan is a batch tool to query the InterPro database. It provides annotations based on multiple searches of profile and other functional databases.

**Dependencies**

The InterProScan package must be installed (http://code.google.com/p/interproscan/wiki/HowToDownload).

#####
Input
#####

A FASTA file containing protein sequences is required.

######
Output
######

Generic Feature Format Version 3 (GFF3)

The GFF3 format is a flat tab-delimited file, which is much richer than the TSV output format. It allows you to trace back from matches to predicted proteins and to nucleic acid sequences. It also contains a FASTA format representation of the predicted protein sequences and their matches. Documentation of all the columns and attributes used is available at http://www.sequenceontology.org/gff3.shtml.

Example Output
--------------

::

    ##gff-version 3
    ##feature-ontology http://song.cvs.sourceforge.net/viewvc/song/ontology/sofa.obo?revision=1.269
    ##sequence-region AACH01000027 1 1347
    ##seqid|source|type|start|end|score|strand|phase|attributes
    AACH01000027	provided_by_user	nucleic_acid	1	1347	.	+	.	Name=AACH01000027;md5=b2a7416cb92565c004becb7510f46840;ID=AACH01000027
    AACH01000027	getorf	ORF	1	1347	.	+	.	Name=AACH01000027.2_21;Target=pep_AACH01000027_1_1347 1 449;md5=b2a7416cb92565c004becb7510f46840;ID=orf_AACH01000027_1_1347
    AACH01000027	getorf	polypeptide	1	449	.	+	.	md5=fd0743a673ac69fb6e5c67a48f264dd5;ID=pep_AACH01000027_1_1347
    AACH01000027	Pfam	protein_match	84	314	1.2E-45	+	.	Name=PF00696;signature_desc=Amino acid kinase family;Target=null 84 314;status=T;ID=match$8_84_314;Ontology_term="GO:0008652";date=15-04-2013;Dbxref="InterPro:IPR001048","Reactome:REACT_13"
    ##sequence-region 2
    ...
    >pep_AACH01000027_1_1347
    LVLLAAFDCIDDTKLVKQIIISEIINSLPNIVNDKYGRKVLLYLLSPRDPAHTVREIIEV
    LQKGDGNAHSKKDTEIRRREMKYKRIVFKVGTSSLTNEDGSLSRSKVKDITQQLAMLHEA
    GHELILVSSGAIAAGFGALGFKKRPTKIADKQASAAVGQGLLLEEYTTNLLLRQIVSAQI
    LLTQDDFVDKRRYKNAHQALSVLLNRGAIPIINENDSVVIDELKVGDNDTLSAQVAAMVQ
    ADLLVFLTDVDGLYTGNPNSDPRAKRLERIETINREIIDMAGGAGSSNGTGGMLTKIKAA
    TIATESGVPVYICSSLKSDSMIEAAEETEDGSYFVAQEKGLRTQKQWLAFYAQSQGSIWV
    DKGAAEALSQYGKSLLLSGIVEAEGVFSYGDIVTVFDKESGKSLGKGRVQFGASALEDML
    RSQKAKGVLIYRDDWISITPEIQLLFTEF
    ...
    >match$8_84_314
    KRIVFKVGTSSLTNEDGSLSRSKVKDITQQLAMLHEAGHELILVSSGAIAAGFGALGFKK
    RPTKIADKQASAAVGQGLLLEEYTTNLLLRQIVSAQILLTQDDFVDKRRYKNAHQALSVL
    LNRGAIPIINENDSVVIDELKVGDNDTLSAQVAAMVQADLLVFLTDVDGLYTGNPNSDPR
    AKRLERIETINREIIDMAGGAGSSNGTGGMLTKIKAATIATESGVPVYICS

----------
References
----------

If you use this Galaxy tool in work leading to a scientific publication, please cite the following papers:

Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013).
Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology. PeerJ 1:e167
http://dx.doi.org/10.7717/peerj.167

Zdobnov EM, Apweiler R (2001)
InterProScan an integration platform for the signature-recognition methods in InterPro.
Bioinformatics 17, 847-848.
http://dx.doi.org/10.1093/bioinformatics/17.9.847

Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R (2005)
InterProScan: protein domains identifier.
Nucleic Acids Research 33 (Web Server issue), W116-W120.
http://dx.doi.org/10.1093/nar/gki442

Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. (2009)
InterPro: the integrative protein signature database.
Nucleic Acids Research 37 (Database Issue), D224-228.
http://dx.doi.org/10.1093/nar/gkn785

This wrapper is available to install into other Galaxy Instances via the Galaxy Tool Shed at
http://toolshed.g2.bx.psu.edu/view/bgruening/interproscan5

**Galaxy Wrapper Author**::

    * Fernando Pérez
    * Ginés Almagro
    * Laura Entrambasaguas

    </help>
</tool>
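
The help text describes the GFF3 output as nine tab-delimited columns whose last column holds `key=value` attributes. As a rough illustration of how such a line might be read downstream, here is a minimal Python sketch. It is not part of this tool or of InterProScan: the function name `parse_gff3_line` and the choice to return every attribute as a list are assumptions made for this example, and it handles feature lines only (the trailing FASTA section would need to be skipped by the caller).

```python
from urllib.parse import unquote

# Column names as listed in the example output's header comment.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict (hypothetical helper).

    Returns None for blank lines and for '#' comment/directive lines.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError("expected 9 tab-delimited columns, got %d" % len(fields))
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    for pair in record["attributes"].split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        # Multiple values are comma-separated; split first, then strip the
        # optional double quotes and decode any percent-escapes.
        attributes[key] = [unquote(v.strip('"')) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# The Pfam match line from the example output, rebuilt with explicit tabs.
example = "\t".join([
    "AACH01000027", "Pfam", "protein_match", "84", "314", "1.2E-45", "+", ".",
    "Name=PF00696;signature_desc=Amino acid kinase family;Target=null 84 314;"
    "status=T;ID=match$8_84_314;Ontology_term=\"GO:0008652\";date=15-04-2013;"
    "Dbxref=\"InterPro:IPR001048\",\"Reactome:REACT_13\"",
])
match = parse_gff3_line(example)
print(match["type"], match["start"], match["end"])
print(match["attributes"]["Dbxref"])
```

Splitting on tabs rather than on arbitrary whitespace matters here, because attribute values such as `Target=null 84 314` and `signature_desc=Amino acid kinase family` contain spaces.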