Build Protein interaction network (version 2020.02.06)
Copy/paste or from a file (e.g. table)
For example, fill in "c1" if it is the first column, "c2" if it is the second column and so on

Description

As elementary constituents of cellular protein complexes and pathways, protein–protein interactions (PPIs) are key determinants of protein function. This tool builds interaction maps by mapping your list of protein or gene identifiers onto public PPI resources; which resources are available depends on your needs and the species of interest (for details, see the “Parameters” section). The two result files (network and node attributes) can be imported directly into dedicated software (e.g. Cytoscape) to view and further explore the resulting protein interaction network.


Input

"Enter IDs": A list of IDs must be entered either by copy/paste or by choosing a file. The type of identifiers allowed depends on the public resource you select (see below).

In copy/paste mode, input is limited to 5000 IDs.
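
Before pasting a long list, it can help to check its size locally. The sketch below is only an illustration and not part of the tool: the helper names are hypothetical, and the accepted separators (whitespace, commas, semicolons) are an assumption. It also shows how a column label such as "c2" translates to a column position.

```python
import re

MAX_PASTED_IDS = 5000  # copy/paste limit stated in this help text


def split_pasted_ids(text):
    """Split pasted text into IDs, dropping duplicates but keeping order.

    Assumption: IDs are separated by whitespace, commas or semicolons.
    """
    tokens = [t for t in re.split(r"[\s,;]+", text.strip()) if t]
    return list(dict.fromkeys(tokens))


def column_index(label):
    """Translate a column label such as "c2" into a 0-based index."""
    match = re.fullmatch(r"c([1-9][0-9]*)", label.strip())
    if match is None:
        raise ValueError(f"not a column label: {label!r}")
    return int(match.group(1)) - 1


ids = split_pasted_ids("P04217 P01023\nP04217, Q9NRG9")
if len(ids) > MAX_PASTED_IDS:
    print("Too many IDs for copy/paste mode; consider uploading a file instead.")
else:
    print(f"{len(ids)} unique IDs, within the copy/paste limit")
```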


Parameters

"Select database": three databases are currently available:

  1. BioGRID is an interaction repository with data compiled through comprehensive curation efforts. Homo sapiens, Mus musculus and Rattus norvegicus are currently available (for more details, see https://thebiogrid.org/).
  2. BioPlex (biophysical interactions of ORFeome-based complexes) is a network built by creating thousands of cell lines, each expressing a tagged version of a protein from the ORFeome collection. Immunopurification of the tagged protein and detection of the associated proteins by mass spectrometry are the building blocks of the network (for more details, see http://bioplex.hms.harvard.edu/).
  3. HuMAP (Human Protein Complex Map) is one of the most comprehensive views of human protein complexes, built by integrating large-scale affinity purification mass spectrometry (AP/MS) datasets with large-scale biochemical fractionation datasets (for more details, see http://proteincomplexes.org/about). We recommend this resource for exploring human protein complexes.

"Type/source of IDs": the type of identifiers in your list. Only Entrez gene IDs and UniProt accession numbers are allowed. If your identifiers are of another type, please convert them first with the "ID-converter" tool from ProteoRE.

"Species": must be specified when BioGRID is the selected PPI database (Homo sapiens, Mus musculus or Rattus norvegicus). If BioPlex or HuMAP is selected, the species is automatically set to human (Homo sapiens) and the release date is displayed.

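Since only Entrez gene IDs and UniProt accession numbers are accepted, a quick local check of the identifier type can save a failed run. The following is a minimal sketch, not the tool's own validation: the function name is hypothetical, Entrez gene IDs are assumed to be plain positive integers, and the UniProt pattern is the accession format published in the UniProt help pages.

```python
import re

# Accession pattern as documented by UniProt (6- or 10-character accessions).
UNIPROT_AC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# Assumption: an Entrez gene ID is a positive integer without leading zeros.
ENTREZ_GENE_ID = re.compile(r"[1-9][0-9]*")


def guess_id_type(identifier):
    """Return "entrez", "uniprot" or "unknown" for a single identifier."""
    identifier = identifier.strip()
    if ENTREZ_GENE_ID.fullmatch(identifier):
        return "entrez"
    if UNIPROT_AC.fullmatch(identifier):
        return "uniprot"
    return "unknown"


for candidate in ["368", "P04217", "A1BG"]:
    print(candidate, guess_id_type(candidate))
```

Identifiers reported as "unknown" (gene symbols, for instance) would need conversion before being submitted.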

Output

Two output files are created, with the following prefixes: "Network_PPIdatabaseName_" and "Nodes_PPIdatabaseName_" (where "PPIdatabaseName" corresponds to the PPI database selected). The "Network" file describes each interaction between two proteins (one row per binary interaction), while the "Nodes" file contains attributes (i.e. annotations) for each gene/protein. A brief example of each output file, as produced when BioGRID is selected, is shown below. "NA" is written where no information is available.

"Network" output file (example):

Network file (if BioGRID database selected - simulated data)
Entrez_Gene_Interactor_A | Entrez Gene Interactor B | Gene symbol Interactor A | Gene symbol Interactor B | Experimental System | Experimental Type | Pubmed ID | Interaction Score | Phenotypes
1 | 368 | A1BG | ABCC6 | Two-hybrid | physical | 21988832 | NA | Growth abnormality
1 | 10549 | A1BG | PRDX4 | Negative Genetic | genetic | 21988832 | NA | NA
1 | 9923 | A1BG | ZBTB40 | Affinity Capture-MS | physical | 28514442 | 0.99977983 | NA

"Interaction Score": a positive or negative value recorded by the original publication, representing a P-value, confidence score, SGA score, etc. It is “NA” if no score is reported.

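For users who want to filter interactions before loading them into a viewer, the "Network" file can be read with a few lines of Python. This is a sketch under assumptions: the file is taken to be tab-separated with one header row and the column order of the simulated example, and the function name is hypothetical.

```python
import csv
import io

# Simulated rows from the example; the tab delimiter is an assumption.
NETWORK_EXAMPLE = (
    "Entrez_Gene_Interactor_A\tEntrez Gene Interactor B\t"
    "Gene symbol Interactor A\tGene symbol Interactor B\t"
    "Experimental System\tExperimental Type\tPubmed ID\t"
    "Interaction Score\tPhenotypes\n"
    "1\t368\tA1BG\tABCC6\tTwo-hybrid\tphysical\t21988832\tNA\tGrowth abnormality\n"
    "1\t10549\tA1BG\tPRDX4\tNegative Genetic\tgenetic\t21988832\tNA\tNA\n"
    "1\t9923\tA1BG\tZBTB40\tAffinity Capture-MS\tphysical\t28514442\t0.99977983\tNA\n"
)


def read_network(handle):
    """Return one dict per binary interaction; an "NA" score becomes None."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    edges = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        score = None if row[7] == "NA" else float(row[7])
        edges.append({"a": row[0], "b": row[1], "type": row[5], "score": score})
    return edges


edges = read_network(io.StringIO(NETWORK_EXAMPLE))
# Keep only physical interactions, e.g. before building a graph.
physical = [(e["a"], e["b"]) for e in edges if e["type"] == "physical"]
print(physical)
```

With a real output file, `io.StringIO(NETWORK_EXAMPLE)` would be replaced by an opened file handle.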
"Nodes" output file (example):

Nodes file (if BioGRID database selected - simulated data)
Entrez gene ID | Official Symbol Interactor | Present in user input ids | ID present in Biogrid | Human Pathway
1 | A1BG | True | True | Platelet degranulation ;Neutrophil degranulation
10 | NAT2 | False | True | Acetylation
12 | SERPINA3 | True | False | NA
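
The "Nodes" file can be parsed in the same spirit, for instance to separate the proteins you submitted from the interactors the database added. Again, this is only a sketch: the tab delimiter, the column order, the ";" pathway separator and the function name are assumptions drawn from the simulated example.

```python
import csv
import io

# Simulated rows from the example; the tab delimiter is an assumption.
NODES_EXAMPLE = (
    "Entrez gene ID\tOfficial Symbol Interactor\tPresent in user input ids\t"
    "ID present in Biogrid\tHuman Pathway\n"
    "1\tA1BG\tTrue\tTrue\tPlatelet degranulation ;Neutrophil degranulation\n"
    "10\tNAT2\tFalse\tTrue\tAcetylation\n"
    "12\tSERPINA3\tTrue\tFalse\tNA\n"
)


def read_nodes(handle):
    """Map each gene ID to its attributes; an "NA" pathway becomes an empty list."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    nodes = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        gene_id, symbol, in_input, in_db, pathways = row[:5]
        if pathways == "NA":
            pathway_list = []
        else:
            pathway_list = [p.strip() for p in pathways.split(";") if p.strip()]
        nodes[gene_id] = {
            "symbol": symbol,
            "in_input": in_input == "True",
            "in_database": in_db == "True",
            "pathways": pathway_list,
        }
    return nodes


nodes = read_nodes(io.StringIO(NODES_EXAMPLE))
# Interactors that were not in the submitted list.
added = [n["symbol"] for n in nodes.values() if not n["in_input"]]
print(added)
```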

These two files can be imported directly into visualization software (such as Cytoscape, https://cytoscape.org/download.html) for further exploration and analysis of the newly created biological network.


Data source (release date)

This tool uses the following public resources (for more details, please check: http://www.proteore.org/static/data_source.html)

Data were downloaded from BioGRID: https://downloads.thebiogrid.org/BioGRID/

Installation dates: 06/02/2020, 01/03/2019

BioPlex_interactionList_v4a.tsv: http://bioplex.hms.harvard.edu/data/BioPlex_interactionList_v4a.tsv

nodeTable.txt: http://proteincomplexes.org/static/downloads/nodeTable.txt

pairsWprob: http://proteincomplexes.org/static/downloads/pairsWprob.txt

Mapping files linking the source database identifiers (Entrez gene ID and UniProt accession number) to Reactome pathways are based on the 2018-12-07 versions of NCBI2Reactome.txt and UniProt2Reactome.txt (from https://www.reactome.org/download-data).


Galaxy integration

David Christiany, Lisa Perus, Florence Combes, Yves Vandenbrouck - CEA, INSERM, CNRS, Grenoble-Alpes University, BIG Institute, FR

Sandra Dérozier, Olivier Rué, Valentin Loux - INRA, Paris-Saclay University, MAIAGE Unit, Migale Bioinformatics platform, FR

This work has been partially funded through the French National Agency for Research (ANR) IFB project.

Help: contact@proteore.org for any questions or concerns about the Galaxy implementation of this tool.