
Authors

Gabriel Cretin (for perl and Galaxy), Yann Guitton (for R version and tests) and Franck Giacomoni (for perl and Galaxy)

If you use this tool, please cite MassBank

for Golm Metabolome Database (http://gmd.mpimp-golm.mpg.de/):
Hummel, J., Strehmel, N., Selbig, J., Walther, D. and Kopka, J. (2010) Decision tree supported substructure prediction of metabolites from GC-MS profiles. Metabolomics. http://dx.doi.org/10.1007/s11306-010-0198-7

Description

The Golm Metabolome Database (GMD) facilitates the search for and dissemination of reference mass spectra from biologically active metabolites quantified using gas chromatography (GC) coupled to mass spectrometry (MS). This tool helps annotate masses from GC-MS data by querying the GMD web services.

Input files

Parameter: inputSpectra
Format : msp

A file containing spectra in the msp format. Example of a spectrum in msp format:

Name: Unknown1
DB.idx: -1
rt: 10.58
Class: Unknown
rt.sd: 0.003
Num Peaks: 19
73.0465 826983.38; 74.0481 70018.08; 75.0319 69475.73; 100.0573 37477.24; 103.0227 43054.28;
116.0884 1433179.62; 117.0905 151975.23; 118.0869 53105.64; 128.0526 26404.77; 131.0359 22647.44;
133.0438 22141.56; 147.0666 255488.28; 148.066 49965.66; 149.0551 37762.38; 190.1069 72568.23;
191.1063 18017.34; 192.1023 6460.8; 207.0333 35435.81; 218.1028 30528.82;
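
To make the layout above concrete, here is a minimal Python sketch of how such a file could be read. It is not the parser used by this tool; the function name parse_msp and the returned dictionary layout are illustrative assumptions. It treats each "Name:" line as the start of a new spectrum, "key: value" lines as header fields, and the remaining lines as "m/z intensity;" pairs.

```python
def parse_msp(text):
    """Parse msp-formatted text into a list of spectra (hypothetical helper).

    Each spectrum is a dict with "fields" (header key/value strings)
    and "peaks" (list of (mz, intensity) float tuples).
    """
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.lower().startswith("name:"):
            # A "Name:" line opens a new spectrum record.
            current = {"fields": {}, "peaks": []}
            spectra.append(current)
        if current is None:
            continue  # ignore anything before the first "Name:" line
        if ":" in line and not line[0].isdigit():
            # Header line such as "rt: 10.58"
            key, value = line.split(":", 1)
            current["fields"][key.strip()] = value.strip()
        else:
            # Peak line: "mz intensity; mz intensity; ..."
            for pair in line.split(";"):
                parts = pair.split()
                if len(parts) == 2:
                    current["peaks"].append((float(parts[0]), float(parts[1])))
    return spectra
```

Comparing len(spectrum["peaks"]) with the "Num Peaks" field is a cheap sanity check before submitting a file.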

Parameters

Would you use a file

Choose whether the masses are read from a file or entered manually.
YES (default) : the parameter "File of masses (format: msp)" is visible
NO : the parameter "Masses of the molecule (entered manually)" is visible
In both cases, all other parameters remain available.

Column type

VAR5 means a 5%-phenyl-95%-dimethylpolysiloxane column and MDN35 means a 35%-phenyl-65%-dimethylpolysiloxane column. If you don't know, select 'None'.

Alkane Retention Index

If no alkane RI is available in your setup for either VAR5 or MDN35, select 'None' in the Column type field above.

Retention Index Window

This value is used only for the library search. A larger window increases the number of matches,
but identification becomes less reliable because spectra can match falsely without RI consensus.
The maximum number of hits returned from the database is limited for performance reasons.

Maximum Hits

Maximum number of hits returned by the Golm database, default = 0 (which means all of them are taken into account).

Number of significant decimals

Number of decimals kept for your m/z values (mzRes).
Example: m/z = 73.798 with mzRes = 4 becomes 73.7980
m/z = 73.798 with mzRes = 0 becomes 74
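
The two examples above can be reproduced with fixed-point formatting. This is only a sketch of the rounding behaviour described here, not the tool's own code; format_mz is a hypothetical helper name.

```python
def format_mz(mz, mz_res):
    """Format an m/z value with mz_res decimals (hypothetical helper)."""
    # Fixed-point formatting rounds and pads with zeros as needed.
    return f"{mz:.{mz_res}f}"


print(format_mz(73.798, 4))  # 73.7980
print(format_mz(73.798, 0))  # 74
```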

Maximum number of ions

Number of m/z and intensity pairs per spectrum that you want to keep for the queries to Golm, default = 0 = all of them.
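
As an illustration of what limiting the number of ions can look like, the sketch below keeps the N most intense peaks of a spectrum. This help text does not say which ions the tool keeps, so selecting by intensity is an assumption, and keep_top_ions is a hypothetical helper name.

```python
def keep_top_ions(peaks, max_ions=0):
    """Keep at most max_ions (mz, intensity) pairs; 0 keeps everything.

    Assumption: the most intense peaks are the ones worth keeping.
    """
    if max_ions <= 0:
        return list(peaks)
    # Rank by intensity, keep the strongest, then restore ascending m/z order.
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:max_ions]
    return sorted(strongest)
```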

JaccardDistanceThreshold

Number of matches (a mass with an appropriate intensity in both spectra) divided by the sum of matches and mismatches (a mass for which only one of the two spectra has an intensity).
The Jaccard distance is a binary distance: it considers only whether a peak is present, not how intense it is.
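
The ratio described above can be sketched as follows. The code assumes each spectrum is reduced to the set of masses that carry a peak (for example nominal integer masses); jaccard_ratio is a hypothetical name, and how GMD bins masses is not specified here. Note that the ratio as written is 1 for identical peak sets; the conventional Jaccard distance is 1 minus this ratio.

```python
def jaccard_ratio(query_masses, hit_masses):
    """matches / (matches + mismatches) for two collections of masses."""
    q, h = set(query_masses), set(hit_masses)
    matches = len(q & h)      # masses present in both spectra
    mismatches = len(q ^ h)   # masses present in exactly one spectrum
    if matches + mismatches == 0:
        return 0.0            # both spectra empty: nothing to compare
    return matches / (matches + mismatches)


print(jaccard_ratio([73, 74, 147], [73, 147, 148]))  # 0.5
```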

s12GowerLegendreDistanceThreshold

The distance measure S12GowLeg = sqrt(1 - s12) is derived from the S12 coefficient of Gower & Legendre, defined as s12 = a / sqrt((a + b)(a + c)), where "a" is the number of positions at which both spectra are in the "on" state, and "b" and "c" are the numbers of positions at which only the query spectrum or only the hit spectrum, respectively, is in the "on" state.
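
A direct transcription of this formula into Python, under the assumption that a position is "on" when the spectrum has a peak at that mass. The function name is hypothetical, and the handling of empty spectra is a choice made for this sketch only.

```python
import math


def s12_gower_legendre_distance(query_masses, hit_masses):
    """sqrt(1 - s12) with s12 = a / sqrt((a + b) * (a + c))."""
    q, h = set(query_masses), set(hit_masses)
    a = len(q & h)  # on in both spectra
    b = len(q - h)  # on in the query only
    c = len(h - q)  # on in the hit only
    denom = math.sqrt((a + b) * (a + c))
    s12 = a / denom if denom else 0.0  # assumption: empty input gives s12 = 0
    # Clamp to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - s12))
```

For query masses 73, 74, 147 and hit masses 73, 147, 148 this gives a = 2, b = 1, c = 1, so s12 = 2/3 and the distance is sqrt(1/3), about 0.577.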

DotproductDistanceThreshold

The Dotproduct sums the products of the intensities over all peaks that match between the two spectra. To satisfy the conditions of a metric, namely I) non-negativity, II) identity of indiscernibles, III) symmetry and IV) subadditivity (triangle inequality), we use 1 - Dotproduct. Both spectra are first normalised to unit vector norm, so that the sum of their squared intensities equals 1.
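
A minimal sketch of this calculation, assuming each spectrum is given as a dictionary mapping mass to intensity and that peaks match when their masses are equal. dot_product_distance is a hypothetical name, not a GMD function.

```python
import math


def dot_product_distance(query, hit):
    """1 - dot product of two spectra after scaling each to unit norm."""

    def unit(spectrum):
        # Scale so that the sum of squared intensities equals 1.
        norm = math.sqrt(sum(i * i for i in spectrum.values()))
        if norm == 0:
            return {}
        return {mz: i / norm for mz, i in spectrum.items()}

    q, h = unit(query), unit(hit)
    # Only masses present in both spectra contribute to the sum.
    dot = sum(q[mz] * h[mz] for mz in q.keys() & h.keys())
    return 1.0 - dot
```

Identical spectra give a distance close to 0 (up to floating-point rounding), and spectra with no shared mass give exactly 1.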

HammingDistanceThreshold

In information theory, the Hamming distance between two strings of equal length is the number of positions for which the corresponding symbols are different. Put another way, it measures the minimum number of substitutions required to change one into the other, or the number of errors that transformed one string into the other.
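
The definition above translates into a few lines of Python. How the tool encodes spectra before comparing them is not described here; the second call below assumes one on/off flag per mass position, purely for illustration.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("karolin", "kathrin"))        # 3
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))  # 2
```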

EuclideanDistanceThreshold

The Euclidean distance is the square root of the sum of the squared intensity differences over all matching peaks.
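
As a sketch, assuming each spectrum is a dictionary mapping mass to intensity and that "matching peaks" means masses present in both spectra (euclidean_distance is a hypothetical name):

```python
import math


def euclidean_distance(query, hit):
    """Square root of the summed squared intensity differences
    over the masses present in both spectra."""
    shared = query.keys() & hit.keys()
    return math.sqrt(sum((query[mz] - hit[mz]) ** 2 for mz in shared))


# Shared masses 73 and 147 differ by 3 and 4, so the distance is 5.
print(euclidean_distance({73: 1.0, 147: 5.0}, {73: 4.0, 147: 1.0, 148: 9.0}))  # 5.0
```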

Type of intensities

Use absolute or relative intensities.
Example: relative = percentage of the most intense peak, i.e. (intensity * 100) / max(intensities); absolute = intensities left untouched
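
The relative conversion given in the example can be written as a short Python function. to_relative is a hypothetical helper name, and returning the input unchanged when there is no positive maximum is a choice made for this sketch.

```python
def to_relative(intensities):
    """Express each intensity as a percentage of the largest one."""
    if not intensities:
        return []
    top = max(intensities)
    if top <= 0:
        return list(intensities)  # nothing sensible to scale by
    return [i * 100 / top for i in intensities]


print(to_relative([50.0, 200.0, 100.0]))  # [25.0, 100.0, 50.0]
```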

Output files

Three types of files

GOLM.html : to view results on a web page (HTML).
GOLM.xlsx : to get results in an Excel-like format.
GOLM.tabular : to get results in tabular format.

Working example

Refer to the corresponding W4M HowTo section: http://workflow4metabolomics.org/howto
Format Data For Postprocessing
Perform GCMS Annotations
