
Unipept (version 4.5.1)
Isoleucine (I) and leucine (L) are equated when matching tryptic peptides to UniProt records.
Return the complete lineage of the taxonomic lowest common ancestor, and include ID fields.
Return the names in the complete taxonomic lineage.
Include fields for the most specific taxonomic classification (taxon_rank, taxon_id, taxon_name) before the lineage.

Unipept

Retrieve UniProt and taxonomic information for tryptic peptides.

Unipept API documentation - https://unipept.ugent.be/apidocs

Input

Input peptides can be retrieved from tabular, fasta, mzid, or pepxml datasets.

Processing details:

The input peptides are split into tryptic peptide fragments in order to match the Unipept records.
Only fragments that are complete tryptic peptides between 5 and 50 amino acids in length will be matched by Unipept.
The match to the most specific tryptic fragment is reported.
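
The splitting step described above can be sketched as follows. This is a minimal illustration, not the tool's actual code. It assumes the usual trypsin rule (cleave after K or R unless the next residue is P), which the text above does not spell out, and the function name tryptic_fragments is hypothetical. Keeping a trailing fragment that does not end in K or R is also an assumption.

```python
def tryptic_fragments(sequence, min_len=5, max_len=50):
    """Split a peptide into tryptic fragments and keep those of matchable length.

    Assumed cleavage rule: cut after K or R, except when followed by P.
    """
    seq = sequence.strip().upper()
    fragments = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Trailing fragment without a terminal K/R (kept here by assumption).
        fragments.append(seq[start:])
    # Only fragments of 5 to 50 residues are matched by Unipept.
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

For example, tryptic_fragments("AALTERKPMGGKAB") cleaves after the R and after the second K, skips the K that precedes P, and drops the two-residue tail because it is shorter than 5 residues.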

Unipept APIs

pept2prot - https://unipept.ugent.be/apidocs/pept2prot

Returns the list of UniProt entries containing a given tryptic peptide. This is the same information as provided on the Protein matches tab when performing a search with the Tryptic Peptide Analysis in the web interface.

By default, each object contains the following information fields extracted from the UniProt record:

peptide: the peptide that matched this record
uniprot_id: the UniProt accession number of the matching record
taxon_id: the NCBI taxon id of the organism associated with the matching record

When the extra parameter is set to true, objects contain the following additional fields extracted from the UniProt record:

taxon_name: the name of the organism associated with the matching UniProt record
ec_references: a space separated list of associated EC numbers
go_references: a space separated list of associated GO terms
refseq_ids: a space separated list of associated RefSeq accession numbers
refseq_protein_ids: a space separated list of associated RefSeq protein accession numbers
insdc_ids: a space separated list of associated INSDC accession numbers
insdc_protein_ids: a space separated list of associated INSDC protein accession numbers
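
A request for these fields might be assembled as in the sketch below. It only builds the query URL and does not contact the server. The base URL, the .json suffix, and the parameter names input[] and equate_il are assumptions that should be checked against the API documentation linked above. Only the extra parameter is named in this help text, and build_query is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed API base; verify against the Unipept API documentation.
API_BASE = "https://api.unipept.ugent.be/api/v1"


def build_query(method, peptides, equate_il=False, extra=False):
    """Build a GET URL for an API method such as 'pept2prot'."""
    # One input[] entry per tryptic peptide (assumed parameter name).
    params = [("input[]", peptide) for peptide in peptides]
    if equate_il:
        params.append(("equate_il", "true"))  # assumed parameter name
    if extra:
        params.append(("extra", "true"))  # request the additional fields
    return f"{API_BASE}/{method}.json?{urlencode(params)}"


url = build_query("pept2prot", ["AIPQLEVARPADAYETAEAYR"], extra=True)
```

The same helper could be reused for the other methods described below by changing the method name.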

pept2taxa - http://unipept.ugent.be/apidocs/pept2taxa

Returns the set of organisms associated with the UniProt entries containing a given tryptic peptide. This is the same information as provided on the Lineage table tab when performing a search with the Tryptic Peptide Analysis in the web interface.

By default, each object contains the following information fields extracted from the UniProt record and NCBI taxonomy:

peptide: the peptide that matched this record
taxon_id: the NCBI taxon id of the organism associated with the matching record
taxon_name: the name of the organism associated with the matching record
taxon_rank: the taxonomic rank of the organism associated with the matching record

When the extra parameter is set to true, objects contain additional information about the lineages of the organism extracted from the NCBI taxonomy. The taxon id of each rank in the lineage is specified using the following information fields:

superkingdom_id
kingdom_id
subkingdom_id
superphylum_id
phylum_id
subphylum_id
superclass_id
class_id
subclass_id
infraclass_id
superorder_id
order_id
suborder_id
infraorder_id
parvorder_id
superfamily_id
family_id
subfamily_id
tribe_id
subtribe_id
genus_id
subgenus_id
species_group_id
species_subgroup_id
species_id
subspecies_id
varietas_id
forma_id
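
The rank fields above can be turned into an ordered lineage, as in this sketch. It assumes each result is available as a Python dict keyed by the field names listed above and that ranks without a taxon are missing or null. The helper name lineage_ids and the sample record are hypothetical, and the IDs are made-up placeholders rather than real API output.

```python
# Rank order as listed above, from most general to most specific.
RANKS = [
    "superkingdom", "kingdom", "subkingdom", "superphylum", "phylum",
    "subphylum", "superclass", "class", "subclass", "infraclass",
    "superorder", "order", "suborder", "infraorder", "parvorder",
    "superfamily", "family", "subfamily", "tribe", "subtribe",
    "genus", "subgenus", "species_group", "species_subgroup",
    "species", "subspecies", "varietas", "forma",
]


def lineage_ids(record):
    """Return (rank, taxon_id) pairs for the ranks that are set in a record."""
    pairs = []
    for rank in RANKS:
        taxon_id = record.get(rank + "_id")
        if taxon_id is not None:
            pairs.append((rank, taxon_id))
    return pairs


# Hypothetical record with placeholder IDs, for illustration only.
example = {"peptide": "AAAAAK", "taxon_id": 3,
           "superkingdom_id": 1, "phylum_id": 2, "class_id": None, "genus_id": 3}
lineage = lineage_ids(example)
```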

pept2lca - https://unipept.ugent.be/apidocs/pept2lca

Returns the taxonomic lowest common ancestor for a given tryptic peptide. This is the same information as provided when performing a search with the Tryptic Peptide Analysis in the web interface.

By default, each object contains the following information fields extracted from the UniProt record and NCBI taxonomy:

peptide: the peptide that matched this record
taxon_id: the NCBI taxon id of the organism associated with the matching record
taxon_name: the name of the organism associated with the matching record
taxon_rank: the taxonomic rank of the organism associated with the matching record

When the extra parameter is set to true, objects contain additional information about the lineage of the taxonomic lowest common ancestor extracted from the NCBI taxonomy. The taxon id of each rank in the lineage is specified using the following information fields:

superkingdom_id
kingdom_id
subkingdom_id
superphylum_id
phylum_id
subphylum_id
superclass_id
class_id
subclass_id
infraclass_id
superorder_id
order_id
suborder_id
infraorder_id
parvorder_id
superfamily_id
family_id
subfamily_id
tribe_id
subtribe_id
genus_id
subgenus_id
species_group_id
species_subgroup_id
species_id
subspecies_id
varietas_id
forma_id
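
To illustrate what a lowest common ancestor is, the sketch below derives one from several lineages given as rank-aligned lists of taxon IDs. It is a simplified illustration, not Unipept's own calculation, which may treat missing ranks differently. Here a rank with no ID (None) is treated as unknown and skipped. The function name and the IDs are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the most specific taxon id shared by all lineages, or None.

    Each lineage is a list of taxon ids ordered from the most general rank
    to the most specific one, with None where a rank is not set.
    """
    lca = None
    for ids_at_rank in zip(*lineages):
        known = {taxon_id for taxon_id in ids_at_rank if taxon_id is not None}
        if len(known) > 1:
            break  # the lineages diverge at this rank
        if known:
            lca = known.pop()  # all known ids agree at this rank
    return lca


# Placeholder ids: the two lineages agree on the first three ranks only.
shared = lowest_common_ancestor([[1, 10, 100, 1000], [1, 10, 100, 2000]])
```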

pept2ec - https://unipept.ugent.be/apidocs/pept2ec

Returns the functional EC-numbers associated with a given tryptic peptide. This is the same information as provided when performing a search with the Tryptic Peptide Analysis in the web interface.

By default, each object contains the following information fields extracted from the UniProt record and NCBI taxonomy:

peptide: the peptide that matched this record
total_protein_count: the total number of proteins matched with the given peptide
ec_number: the EC-number associated with the current tryptic peptide
protein_count: the number of proteins matched with the given tryptic peptide that are labeled with the current EC-number
name: Optional, name of the EC-number. Included when the extra parameter is set to true.
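
The two counts above can be combined into the fraction of matched proteins that carry each EC-number. The sketch below assumes the EC annotations for one peptide are available as a list of dicts with the ec_number and protein_count fields. The function name ec_fractions and the sample counts are hypothetical.

```python
def ec_fractions(total_protein_count, ec_entries):
    """Map each EC-number to the fraction of matched proteins labeled with it."""
    if total_protein_count <= 0:
        return {}  # nothing matched, so no fractions to report
    return {
        entry["ec_number"]: entry["protein_count"] / total_protein_count
        for entry in ec_entries
    }


# Made-up counts for illustration: 4 matched proteins in total.
fractions = ec_fractions(4, [
    {"ec_number": "1.1.1.1", "protein_count": 1},
    {"ec_number": "2.7.11.1", "protein_count": 3},
])
```

The same calculation applies to the protein_count values reported for GO-terms and InterPro entries below.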

pept2go - https://unipept.ugent.be/apidocs/pept2go

Returns the functional GO-terms associated with a given tryptic peptide. This is the same information as provided when performing a search with the Tryptic Peptide Analysis in the web interface.

By default, each object contains the following information fields extracted from the UniProt record and NCBI taxonomy:

peptide: the peptide that matched this record
total_protein_count: the total number of proteins matched with the given peptide
go_term: the GO-term associated with the current tryptic peptide
protein_count: the number of proteins matched with the given tryptic peptide that are labeled with the current GO-term
name: Optional, name of the GO-term. Included when the extra parameter is set to true.

pept2interpro - https://unipept.ugent.be/apidocs/pept2interpro

Returns the functional InterPro entries associated with a given tryptic peptide. This is the same information as provided when performing a search with the Tryptic Peptide Analysis in the web interface.

By default, each object contains the following information fields extracted from the UniProt record and NCBI taxonomy:

peptide: the peptide that matched this record
total_protein_count: the total number of proteins matched with the given peptide
code: the InterPro entry code associated with the current tryptic peptide
protein_count: the number of proteins matched with the given tryptic peptide that are labeled with the current InterPro code
type: Optional, type of the InterPro entry. Included when the extra parameter is set to true.
name: Optional, name of the InterPro entry. Included when the extra parameter is set to true.

pept2funct - https://unipept.ugent.be/apidocs/pept2funct

Returns the functional EC-numbers, GO-terms and InterPro entries associated with a given tryptic peptide. This is the same information as provided when performing a search with the Tryptic Peptide Analysis in the web interface.

By default, each object contains the following information fields extracted from the UniProt record and NCBI taxonomy:

peptide: the peptide that matched this record
total_protein_count: the total number of proteins matched with the given peptide
ec_number: the EC-number associated with the current tryptic peptide
protein_count: the number of proteins matched with the given tryptic peptide that are labeled with the current EC-number
name: optional, name of the EC-number; included when the extra parameter is set to true
go_term: the GO-term associated with the current tryptic peptide
protein_count: the number of proteins matched with the given tryptic peptide that are labeled with the current GO-term
name: optional, name of the GO-term; included when the extra parameter is set to true
code: the InterPro entry code associated with the current tryptic peptide
protein_count: the number of proteins matched with the given tryptic peptide that are labeled with the current InterPro code
type: Optional, type of the InterPro entry. Included when the extra parameter is set to true.
name: Optional, name of the InterPro entry. Included when the extra parameter is set to true.
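
Because protein_count and name recur for each annotation type, a combined result is easier to handle once flattened into one row per annotation. The sketch below assumes the annotations for a peptide are grouped under the keys ec, go and ipr. Those key names, the function name functional_rows and the sample record are assumptions for illustration, not confirmed output of the service.

```python
# Identifier field for each assumed annotation group.
ID_FIELDS = {"ec": "ec_number", "go": "go_term", "ipr": "code"}


def functional_rows(record):
    """Flatten one combined result into (peptide, group, identifier, protein_count) rows."""
    rows = []
    for group, id_field in ID_FIELDS.items():
        for entry in record.get(group, []):
            rows.append((record["peptide"], group, entry[id_field], entry["protein_count"]))
    return rows


# Hypothetical record with placeholder identifiers and counts.
example = {
    "peptide": "AAAAAK",
    "total_protein_count": 5,
    "ec": [{"ec_number": "1.1.1.1", "protein_count": 2}],
    "go": [{"go_term": "GO:0000000", "protein_count": 4}],
    "ipr": [],
}
rows = functional_rows(example)
```

Rows in this shape map directly onto a tabular dataset, with one line per peptide and annotation.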

Attributions

The Unipept metaproteomics analysis pipeline. Bart Mesuere, Griet Debyser, Maarten Aerts, Bart Devreese, Peter Vandamme and Peter Dawyndt. Article first published online: 11 FEB 2015. DOI: 10.1002/pmic.201400361