
Map ENSEMBLIDs to Gene Symbols. (version 21.7.22)
Name of the column in your dataset containing unique FeatureIDs.
Name of the column containing the ENSEMBLIDs to use for linking to gene symbols.

Tool Description

This tool takes an annotation data file containing unique FeatureIDs and Ensembl IDs and adds gene symbols. The link from the Ensembl IDs to gene symbols is made using mygene (https://mygene.info/). The tool adds the following columns to the input annotation data file: GeneSymbol, Score, Selected and Tie.

The GeneSymbol column contains the short-form abbreviation for the gene. The Score column contains a value generated by mygene indicating how well the Ensembl ID matched the returned gene symbol(s) (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0953-9). When an Ensembl ID matches exactly one gene symbol, the Selected column = ‘Yes’. When an Ensembl ID matches more than one gene symbol, the Selected column = ‘Yes’ for the gene symbol with the best Score value. If there is a tie, the alphabetically first gene symbol is selected and the Tie column = ‘Yes’. Note that because an Ensembl ID can match several gene symbols, FeatureID may not be unique in the resulting output dataset.
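The selection rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's actual source: the function name mark_selected and the list-of-dicts layout are hypothetical, a higher Score is assumed to be better, "alphabetically first" is assumed to be case-insensitive, and Tie is assumed to be ‘Yes’ on every row that shares the top Score (and ‘No’ otherwise).

```python
from collections import defaultdict


def mark_selected(hits):
    """Add 'Selected' and 'Tie' to each hit (hypothetical helper).

    hits: list of dicts with keys 'ENSEMBLID', 'GeneSymbol', 'Score'.
    Assumptions: higher Score is better; ties are broken by the
    case-insensitive alphabetical order of GeneSymbol.
    """
    # Group the returned gene symbols by the Ensembl ID they matched.
    by_id = defaultdict(list)
    for hit in hits:
        by_id[hit["ENSEMBLID"]].append(hit)

    out = []
    for group in by_id.values():
        best = max(h["Score"] for h in group)
        top = [h for h in group if h["Score"] == best]
        tie = len(top) > 1
        # Alphabetically first symbol among the top-scoring hits.
        winner = min((h["GeneSymbol"] for h in top), key=str.casefold)
        for h in group:
            row = dict(h)  # copy so the caller's data is left untouched
            is_top = h["Score"] == best
            row["Selected"] = "Yes" if is_top and h["GeneSymbol"] == winner else "No"
            row["Tie"] = "Yes" if tie and is_top else "No"
            out.append(row)
    return out
```

An Ensembl ID with a single hit always gets Selected = ‘Yes’ under this sketch, because its only hit is trivially the best-scoring one.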


Input

Dataset with unique FeatureID and ENSEMBLID values

FeatureID     ENSEMBLID   ...
FeatureID_1   ENS...      ...
FeatureID_2   ENS...      ...
FeatureID_3   ENS...      ...
...           ...         ...
NOTE: This file must contain at least two columns, a column with unique FeatureIDs and a column containing ENSEMBLIDs. Other columns may be present.
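Before running the tool, it can help to confirm that a file meets these requirements. The following is a minimal sketch, assuming a tab-separated file with a header row (the delimiter is an assumption, since the file format is not stated here); the function name check_input is hypothetical and is not part of the tool.

```python
import csv


def check_input(handle, feature_col, ensembl_col, delimiter="\t"):
    """Check that both named columns exist and FeatureIDs are unique.

    Hypothetical helper; the tab delimiter is an assumption.
    Returns the number of data rows, or raises ValueError.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [c for c in (feature_col, ensembl_col) if c not in header]
    if missing:
        raise ValueError("missing column(s): " + ", ".join(missing))

    seen = set()
    duplicates = []
    for row in reader:
        feature_id = row[feature_col]
        if feature_id in seen:
            duplicates.append(feature_id)
        seen.add(feature_id)
    if duplicates:
        raise ValueError("duplicate FeatureID(s): " + ", ".join(duplicates))
    return len(seen)
```

The two column names passed to the function correspond to the Unique FeatureID and ENSEMBLID parameters of the tool; any other columns in the file are ignored by the check.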

Unique FeatureID

Name of the column in your input dataset that has unique FeatureIDs.

ENSEMBLID

Name of the column containing the ENSEMBLIDs.


Output

The user will get a single output file containing the linked gene symbols.

Output Table

FeatureID     ENSEMBLID   ...   GeneSymbol   Score       Selected
FeatureID_1   ENS...      ...   one*         13.550056   Yes
FeatureID_2   ENS...      ...   two*         12.984067   Yes
FeatureID_2   ENS...      ...   three*       11.995048   No
FeatureID_3   ENS...      ...   four*        12.549084   Yes
...           ...         ...   ...          ...         ...

* = placeholder for the matched gene symbol