diff help.xml @ 0:82737757f3d5 draft
planemo upload for repository https://github.com/RECETOX/galaxytools/tree/master/tools/recetox_aplcms commit 506df2aef355b3791567283e1a175914f06b405a
author | recetox |
---|---|
date | Mon, 13 Feb 2023 10:27:56 +0000 |
parents | |
children | ce00e1d03c31 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/help.xml Mon Feb 13 10:27:56 2023 +0000
@@ -0,0 +1,255 @@
+<macros>
+
+<token name="@GENERAL_HELP@">
+General Information
+===================
+
+Overview
+--------
+
+recetox-aplcms is a software package for peak detection in high-resolution mass spectrometry (HRMS) data.
+It supports reading .mzml files in raw profile mode and uses a bi-Gaussian chromatographic peak shape for feature detection and quantification.
+
+recetox-aplcms is based on the apLCMS package developed by Tianwei Yu at Emory University; see the citations and the apLCMS section below.
+This version includes various software updates and is actively developed and maintained on `GitHub`_.
+Please submit any bug reports as `issues`_ on the repository.
+
+.. _GitHub: https://github.com/RECETOX/recetox-aplcms
+.. _issues: https://github.com/RECETOX/recetox-aplcms/issues/new
+
+
+Workflow
+--------
+
+.. image:: https://raw.githubusercontent.com/RECETOX/galaxytools/aee0dd6cf6c05936269efe4337c50e27cc68e86b/tools/recetox_aplcms/images/scheme.png
+    :width: 2560
+    :height: 788
+    :scale: 40
+    :alt: A picture of a workflow diagram.
+
+The individual steps of the recetox-aplcms package can be combined into two separate workflows that process HRMS data either in an unsupervised manner or by including a priori knowledge.
+The workflows consist of the following building blocks:
+
+(1) remove noise - denoise the raw data and extract the EICs
+(2) generate feature table - group features in the EICs into peaks using a peak-shape model
+(3) compute clusters - compute mz and rt clusters across samples
+(4) compute template - find the template for rt correction
+(5) correct time - correct the rt across samples using splines
+(6) align features - align identical features across samples
+(7) recover weaker signals - recover missed features in samples based on the aligned features
+(8) merge known table - add known features to the detected features table and vice versa
+
+For detailed documentation on the individual steps, please see the individual tool wrappers.
+
+
+apLCMS (Original Reference)
+---------------------------
+
+apLCMS is software that generates a feature table from a batch of LC/MS spectra. The m/z and retention time
+tolerance levels are estimated from the data. A run-filter is used to detect peaks and remove noise.
+Non-parametric statistical methods are used to fine-tune peak selection and grouping. After retention time
+correction, a feature table is generated by aligning peaks across spectra. For further information on apLCMS
+please refer to https://mypage.cuhk.edu.cn/academics/yutianwei/apLCMS/.
+</token>
+
+<token name="@REMOVE_NOISE_HELP@">
+recetox-aplcms - remove noise
+=============================
+
+This tool is the first step of recetox-aplcms.
+It removes noise from the raw data and performs a first clustering step, grouping points with close m/z values into extracted ion chromatograms (EICs).
+Only peaks with a minimum elution length of `min_run` seconds are kept.
+
+Example Output
+--------------
+The raw data points contained in the scans of the `mzml` file are filtered for noise and grouped into clusters based on m/z values.
+See an example output in the table below. The `group_number` column indicates the cluster index.
+
++----------------------+-------------------+-----------------------+--------------------+
+| mz                   | rt                | intensity             | group_number       |
++======================+===================+=======================+====================+
+| 70.01060119055192    | 350.58654         | 21178.330810546875    | 5                  |
++----------------------+-------------------+-----------------------+--------------------+
+| 70.02334120404554    | 130.175262        | 287869.5478515625     | 10                 |
++----------------------+-------------------+-----------------------+--------------------+
+| 70.0287408273165     | 134.801352        | 60883.15185546875     | 11                 |
++----------------------+-------------------+-----------------------+--------------------+
+| 70.02872416715464    | 183.991896        | 9201.574584960938     | 11                 |
++----------------------+-------------------+-----------------------+--------------------+
+| ...                  | ...               | ...                   | ...                |
++----------------------+-------------------+-----------------------+--------------------+
+</token>
+
+<token name="@GENERATE_FEATURE_TABLE_HELP@">
+recetox-aplcms - generate feature table
+=======================================
+The second step in the recetox-aplcms workflow performs peak shape parameter estimation.
+
+This tool takes the grouped features created with `recetox-aplcms-remove-noise`, computes the peak shape in the `rt` domain, and integrates the peak area.
+
+
+Example Output
+--------------
+The output contains the `mz` and `rt` of the peaks as well as the standard deviations in both directions of the peak (`sd1` and `sd2`) for the bi-Gaussian peak shape.
+
++----------------------+-------------------+-----------------+-------------------+----------------------+
+| mz                   | rt                | sd1             | sd2               | area                 |
++======================+===================+=================+===================+======================+
+| 70.02317542938793    | 142.36033         | 11.436659559    | 14.592754933      | 4159269.24595184     |
++----------------------+-------------------+-----------------+-------------------+----------------------+
+| 70.02869594233522    | 205.48765         | 0.263230763     | 0.285101428707    | 8849767.11861127     |
++----------------------+-------------------+-----------------+-------------------+----------------------+
+| 78.04643252598305    | 294.01713         | 0.51677558617   | 1.317028944141    | 1333044.50659719     |
++----------------------+-------------------+-----------------+-------------------+----------------------+
+| ...                  | ...               | ...             | ...               | ...                  |
++----------------------+-------------------+-----------------+-------------------+----------------------+
+</token>
+
+<token name="@COMPUTE_CLUSTERS_HELP@">
+recetox-aplcms - compute clusters
+=================================
+
+Group features whose `mz` and `rt` values lie within the tolerances into clusters, creating larger features from raw data points.
+Custom tolerances for `mz` and `rt` are computed based on the given parameters.
+The tool takes a collection of all detected features and computes the clusters over a global feature table, adding the `sample_id` and `cluster` columns to the table.
+
+Example Output
+--------------
+
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| mz                   | rt                | sd1             | sd2               | area                 | sample_id           | cluster       |
++======================+===================+=================+===================+======================+=====================+===============+
+| 70.02317542938793    | 142.36033         | 11.436659559    | 14.592754933      | 4159269.245951841    | 21_qc_no_dil_milliq | 7             |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| 70.02869594233522    | 205.48765         | 0.263230763     | 0.285101428707    | 8849767.11861127     | 21_qc_no_dil_milliq | 9             |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| 78.04643252598305    | 294.01713         | 0.51677558617   | 1.317028944141    | 1333044.506597194    | 21_qc_no_dil_milliq | 13            |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| ...                  | ...               | ...             | ...               | ...                  | ...                 | ...           |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+</token>
+
+<token name="@CORRECT_TIME_HELP@">
+recetox-aplcms - correct time
+=============================
+
+Apply spline-based retention time correction to a feature table given the template table and the computed `mz` and `rt` tolerances.
+
+Example Output
+--------------
+The output has the same format as `compute clusters` but the retention time values are corrected based on the template table.
+
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| mz                   | rt                | sd1             | sd2               | area                 | sample_id           | cluster       |
++======================+===================+=================+===================+======================+=====================+===============+
+| 70.02317542938793    | 142.36033         | 11.436659559    | 14.592754933      | 4159269.245951841    | 21_qc_no_dil_milliq | 7             |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| 70.02869594233522    | 205.48765         | 0.263230763     | 0.285101428707    | 8849767.11861127     | 21_qc_no_dil_milliq | 9             |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| 78.04643252598305    | 294.01713         | 0.51677558617   | 1.317028944141    | 1333044.506597194    | 21_qc_no_dil_milliq | 13            |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| ...                  | ...               | ...             | ...               | ...                  | ...                 | ...           |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+</token>
+<token name="@COMPUTE_TEMPLATE_HELP@">
+recetox-aplcms - compute template
+=================================
+Compute the template from a set of feature tables, choosing the one with the most features as the template.
+</token>
+
+<token name="@RECOVER_WEAKER_SIGNALS_HELP@">
+recetox-aplcms - recover weaker signals
+=======================================
+Second-stage peak detection based on the aligned feature table from the `feature alignment` step.
+If a feature is contained in the aligned feature table, this step revisits the raw data and searches
+for this feature at the retention time obtained by mapping the corrected retention time back to the original sample.
+
+This recovers features that are present in a sample but might have been filtered out initially as noise due to low signal intensity.
+
+Example Output
+--------------
+The table has the same format as the `compute clusters` output but might contain additional features that have been extracted based
+on their presence in the aligned feature table.
+
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| mz                   | rt                | sd1             | sd2               | area                 | sample_id           | cluster       |
++======================+===================+=================+===================+======================+=====================+===============+
+| 70.02317542938793    | 142.36033         | 11.436659559    | 14.592754933      | 4159269.245951841    | 21_qc_no_dil_milliq | 7             |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| 70.02869594233522    | 205.48765         | 0.263230763     | 0.285101428707    | 8849767.11861127     | 21_qc_no_dil_milliq | 9             |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| 78.04643252598305    | 294.01713         | 0.51677558617   | 1.317028944141    | 1333044.506597194    | 21_qc_no_dil_milliq | 13            |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+| ...                  | ...               | ...             | ...               | ...                  | ...                 | ...           |
++----------------------+-------------------+-----------------+-------------------+----------------------+---------------------+---------------+
+</token>
+
+<token name="@ALIGN_FEATURES_HELP@">
+recetox-aplcms - align features
+===============================
+This step performs feature alignment after clustering and retention time correction.
+The peaks clustered across samples are grouped based on the given tolerances to create an aligned feature table, connecting identical features across samples.
+The tool parameter sets the minimum number of samples in which a feature must be detected in order to be included in the aligned feature table.
+
+Example Output
+--------------
+The tool outputs three tables: the peak-related `metadata`, the `retention times` and the `intensities` for all features across all samples.
+
+Metadata Table
+~~~~~~~~~~~~~~
+The `npeaks` column denotes the number of peaks that have been grouped into this feature. The columns with the sample names indicate whether this feature is present in the sample.
+
++-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+
+| id    | mz           | mzmin        | mzmax         | rt             | rtmin         | rtmax         | npeaks    | 21_qc_no_dil_milliq    | 29_qc_no_dil_milliq    | 8_qc_no_dil_milliq     |
++=======+==============+==============+===============+================+===============+===============+===========+========================+========================+========================+
+| 1     | 70.03707021  | 70.037066    | 70.0370750    | 294.1038014    | 294.0634942   | 294.149985    | 3         | 1                      | 1                      | 1                      |
++-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+
+| 2     | 70.06505677  | 70.065045    | 70.0650676    | 141.9560055    | 140.5762528   | 143.335758    | 2         | 1                      | 0                      | 1                      |
++-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+
+| 57    | 78.04643252  | 78.046429    | 78.0464325    | 294.0063397    | 293.9406777   | 294.072001    | 2         | 1                      | 1                      | 0                      |
++-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+
+| ...   | ...          | ...          | ...           | ...            | ...           | ...           | ...       | ...                    | ...                    | ...                    |
++-------+--------------+--------------+---------------+----------------+---------------+---------------+-----------+------------------------+------------------------+------------------------+
+
+Intensity Table
+~~~~~~~~~~~~~~~
+This table contains the peak area for aligned features in all samples.
+
++-------+------------------------+------------------------+------------------------+
+| id    | 21_qc_no_dil_milliq    | 29_qc_no_dil_milliq    | 8_qc_no_dil_milliq     |
++=======+========================+========================+========================+
+| 1     | 13187487.20482895      | 7957395.699119729      | 11700594.397257797     |
++-------+------------------------+------------------------+------------------------+
+| 2     | 2075168.6398983458     | 0                      | 2574362.159289044      |
++-------+------------------------+------------------------+------------------------+
+| 57    | 2934524.4406785755     | 1333044.5065971944     | 0                      |
++-------+------------------------+------------------------+------------------------+
+| ...   | ...                    | ...                    | ...                    |
++-------+------------------------+------------------------+------------------------+
+
+Retention Time Table
+~~~~~~~~~~~~~~~~~~~~
+This table contains the retention times for all aligned features in all samples.
+
++-------+------------------------+------------------------+------------------------+
+| id    | 21_qc_no_dil_milliq    | 29_qc_no_dil_milliq    | 8_qc_no_dil_milliq     |
++=======+========================+========================+========================+
+| 1     | 294.09792478513236     | 294.1499853056912      | 294.0634942428341      |
++-------+------------------------+------------------------+------------------------+
+| 2     | 140.57625284242982     | 0                      | 143.33575827589172     |
++-------+------------------------+------------------------+------------------------+
+| 57    | 294.07200187644435     | 293.9406777222317      | 0                      |
++-------+------------------------+------------------------+------------------------+
+| ...   | ...                    | ...                    | ...                    |
++-------+------------------------+------------------------+------------------------+
+</token>
+
+<token name="@MERGE_KNOWN_TABLES_HELP@">
+recetox-aplcms - merge known table
+==================================
+
+This tool allows merging the detected features back into the table of known features and vice versa.
+It is used in the hybrid version of recetox-aplcms to augment the aligned feature table with the suspect peaks
+and to augment the table of known features with successfully detected features.
+</token>
+</macros>
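
Illustrative sketch (not part of the changeset above): the compute clusters help text describes grouping features whose `mz` and `rt` values fall within tolerances and adding a `cluster` column to a global feature table. The Python below is one minimal way to picture that idea, under loud assumptions: the names `Feature`, `split_by_gap` and `compute_clusters` are hypothetical, the tolerances are passed in as fixed absolute values rather than estimated from the data, and a simple gap-splitting rule stands in for whatever recetox-aplcms actually does internally.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    # Hypothetical record mirroring the mz, rt and sample_id columns of the help tables.
    mz: float
    rt: float
    sample_id: str


def split_by_gap(values, tolerance):
    """Label each value with a group index; a new group starts whenever the
    gap between consecutive sorted values exceeds `tolerance`."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    group = 0
    for pos, idx in enumerate(order):
        if pos > 0 and values[idx] - values[order[pos - 1]] > tolerance:
            group += 1
        labels[idx] = group
    return labels


def compute_clusters(features, mz_tol, rt_tol):
    """Return one cluster id per feature: split on mz gaps first, then split
    each mz group on rt gaps. Tolerances are absolute (an assumption)."""
    mz_labels = split_by_gap([f.mz for f in features], mz_tol)
    clusters = [None] * len(features)
    next_id = 0
    for mz_group in sorted(set(mz_labels)):
        members = [i for i, label in enumerate(mz_labels) if label == mz_group]
        rt_labels = split_by_gap([features[i].rt for i in members], rt_tol)
        for i, rt_label in zip(members, rt_labels):
            clusters[i] = next_id + rt_label
        next_id += max(rt_labels) + 1
    return clusters


features = [
    Feature(70.02317, 142.36, "sample_a"),
    Feature(70.02318, 142.90, "sample_b"),
    Feature(70.02869, 205.49, "sample_a"),
    Feature(70.02318, 300.00, "sample_b"),
]
print(compute_clusters(features, mz_tol=1e-4, rt_tol=5.0))  # prints [0, 0, 2, 1]
```

The first two features share a cluster because both their `mz` and `rt` values are close, while the fourth has a matching `mz` but a distant `rt` and therefore gets its own id. Splitting on `mz` before `rt` is a design choice made for this sketch only.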