Mercurial > repos > recetox > ipapy2_gibbs_sampler_add

<tool id="ipapy2_gibbs_sampler_add" name="ipaPy2 gibbs sampler additive" version="@TOOL_VERSION@+galaxy0" profile="@PROFILE@">
    <macros>
        <import>macros.xml</import>
    </macros>

    <expand macro="requirements"/>

    <command detect_errors="exit_code"><![CDATA[
        python3 '${__tool_directory__}/ipapy2_gibbs_sampler_add.py'
        --input_dataset_mapped_isotope_patterns '${mapped_isotope_patterns}' '${mapped_isotope_patterns.ext}'
        --input_dataset_annotations '${annotations}' '${annotations.ext}'
        --noits '${noits}'
        --burn '${burn}'
        --delta_add '${delta_add}'
        --all_out '${all_out}'
        #if $zs:
            --zs '${zs}' '${zs.ext}'
        #else:
            --zs '' ''
        #end if
        #if $zs_out:
            --zs_out '${zs_out}' '${zs_out.ext}'
        #else:
            --zs_out '' ''
        #end if
        --output_dataset '${annotations_out}' '${annotations_out.ext}'
    ]]></command>

    <inputs>
        <expand macro="gibbs"/>
        <param name="delta_add" type="float" value="1" min="0" label="adducts weight" help="parameter used when computing the conditional priors. The parameter must be positive. The smaller the parameter the more weight the adducts connections have on the posterior probabilities. Default 1." />
    </inputs>

    <outputs>
        <data label="${tool.name} annotations on ${on_string}" name="annotations_out" format_source="mapped_isotope_patterns"/>
        <data label="${tool.name} zs on ${on_string}" name="zs_out" format="txt">
            <filter>options['all_out']</filter>
        </data>
    </outputs>

    <tests>
        <test  expect_num_outputs="2">
            <param name="mapped_isotope_patterns" value="mapped_isotope_patterns.csv"/>
            <param name="annotations" value="clean_annotations.csv"/>
            <param name="noits" value="1000"/>
            <param name="delta_add" value="0.1"/>
            <!-- Not the best way to test, but the results are stochastic hence difficult to test-->
            <output name="annotations_out">
                <assert_contents>
                    <has_n_columns n="15"  sep=","/>
                    <has_n_lines n="15" delta="5" />
                    <has_line line="id,name,formula,adduct,m/z,charge,RT range,ppm,isotope pattern score,fragmentation pattern score,prior,post,post Gibbs,chi-square pval,peak_id" />
                </assert_contents>
            </output>
        </test>
    </tests>

    <help><![CDATA[

.. _ipapy2_gibbs_sampler_add:

=======================================
ipaPy2 Gibbs Sampler Additive Tool
=======================================

**Tool Description**

This tool implements a Gibbs sampler for the IPA (Integrated Probabilistic Annotation) model, focusing on additive (adducts-based) connections between features. It refines metabolite annotation probabilities by iteratively sampling from the posterior distribution, considering relationships between features that can be explained by known adduct transformations.

How it works
------------

- The Gibbs sampler updates annotation probabilities by considering **adducts connections**: relationships between features that can be explained by known adduct transformations.
- The influence of adducts connections is controlled by the ``adducts weight`` (`delta_add`) parameter: smaller values increase the influence of adducts connections on the posterior probabilities.
- The process is stochastic, so results may vary between runs.
- Optionally, the sampler can output the sampled states (`zs_out`) for further analysis.

Inputs
------

1. **Mapped isotope patterns**
   Dataset containing mapped isotope patterns (e.g., output from the ipaPy2 map isotope patterns tool).

2. **Annotations**
   Initial annotation table to be refined by the Gibbs sampler.

3. **Number of iterations (`noits`)**
   Number of Gibbs sampler iterations to perform.

4. **Burn-in (`burn`)**
   Number of initial iterations to discard (burn-in period).

5. **Adducts weight (`delta_add`)**
   Parameter used when computing the conditional priors. Must be positive. The smaller the value, the more weight adducts connections have on the posterior probabilities (default: 1).

6. **All out (`all_out`)**
   If enabled, outputs all intermediate results including sampled states.

7. **zs**
   (Optional) Input file for initial state of the sampler.

Outputs
-------

- **annotations_out**
  Refined annotation table with updated posterior probabilities.

- **zs_out**
  (Optional) File containing sampled states from the Gibbs sampler (if `all_out` is enabled).

Example
-------

Suppose you have mapped isotope patterns and an initial annotation table. You can run the Gibbs sampler as follows:

.. code-block::

    mapped_isotope_patterns.csv
    clean_annotations.csv

Set the number of iterations (e.g., ``noits = 1000``) and the adducts weight (e.g., ``delta_add = 0.1``), then run the tool. The output will be a refined annotation table and, optionally, a file with sampled states.

Notes
-----

- The results are stochastic; repeated runs may yield slightly different outputs.
- For best results, ensure all input files are correctly formatted and contain the required columns.
- The tool supports multiple file formats (CSV, TSV, Parquet, Tabular) for flexibility.

References
----------

- For more details on the Gibbs sampling algorithm and its application in metabolomics, refer to the ipaPy2 documentation or associated publications.

    ]]></help>

    <expand macro="citations"/>
</tool>
author	recetox
date	Fri, 16 May 2025 08:01:46 +0000
parents
children