Mercurial > repos > bgruening > xchem_transfs_scoring

<tool id="xchem_transfs_scoring" name="XChem TransFS pose scoring" version="0.4.0">
    <description>using deep learning</description>

    <requirements>
        <!--requirement type="package" version="3.0.0">openbabel</requirement-->
        <!--requirement type="package" version="3.7">python</requirement-->
        <!-- many other requirements are needed -->
        <container type="docker">informaticsmatters/transfs:1.3</container>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[

    cd /train/fragalysis_test_files/ &&
    mkdir -p workdir/tfs &&
    cd workdir &&

    cp '$ligands' ligands.sdf &&
    cp '$receptor' receptor.pdb &&

    cd ../ &&
    python transfs.py -i /train/fragalysis_test_files/workdir/ligands.sdf -r /train/fragalysis_test_files/workdir/receptor.pdb -d $distance -w workdir/tfs --model '$model' '$mock' &&
    ls -l &&
    ls -l workdir &&
    cp ./workdir/tfs/output.sdf '$output' &&
    head -n 1000 ./workdir/tfs/output.sdf &&

    mkdir -p ./pdb &&
    cp -r ./workdir/tfs/receptor*.pdb ./pdb &&
    tar -cvhf archiv.tar ./pdb &&
    cp archiv.tar '$output_receptors' &&

    cp ./workdir/tfs/predictions.txt '$predictions'

    ]]></command>

    <inputs>
        <param type="data" name="receptor" format="pdb" label="Receptor" help="Select a receptor (PDB format)."/>
        <param type="data" name="ligands" format="sdf,mol" label="Ligands" help="Ligands (docked poses) in SDF format)"/>
        <param name="distance" type="float" value="0" min="0" max="5.0" label="Distance to waters"
               help="Remove waters closer than this distance to any ligand heavy atom (0 means skip this process)."/>
        <param name="model" type="select" label="Model for predictions">
            <option value="weights.caffemodel">No threshold (original model)</option>
            <option value="10nm.0_iter_50000.caffemodel">10nM threshold</option>
            <option value="50nm.0_iter_50000.caffemodel">50nM threshold</option>
            <option value="200nm.0_iter_50000.caffemodel">200nM threshold</option>
        </param>
        <param type="hidden" name="mock" value="" label="Mock calculations" help="Use random numbers instead of running on GPU"/>
    </inputs>
    <outputs>
        <data name="output" format="sdf" label="XChem pose scoring on ${on_string}"/>
        <data name="predictions" format="txt" label="Predictions on ${on_string}"/>
        <data name="output_receptors" format="tar" label="Receptors ${on_string}"/>

        <!--collection name="pdb_files" type="list" label="PDB files with variable number of waters">
            <discover_datasets pattern="__name_and_ext__" directory="pdb" />
        </collection-->
    </outputs>

    <tests>
        <test>
            <param name="receptor" value="receptor.pdb"/>
            <param name="ligands" value="ligands.sdf"/>
            <param name="mock" value="--mock" />
            <param name="distance" value="4.0"/>
            <output name="output" ftype="sdf">
                <assert_contents>
                    <has_text text="TransFSReceptor"/>
                    <has_text text="TransFSScore"/>
                </assert_contents>
            </output> -->
            <!---output_collection name="pdb_files" type="list" count="2" /-->
        </test>
        <test>
            <param name="receptor" value="receptor.pdb"/>
            <param name="ligands" value="ligands.sdf"/>
            <param name="mock" value="--mock" />
            <param name="distance" value="0"/>
            <output name="output" ftype="sdf">
                <assert_contents>
                    <not_has_text text="TransFSReceptor"/>
                    <has_text text="TransFSScore"/>
                </assert_contents>
            </output>
            <!--output_collection name="pdb_files" type="list" count="1" /-->
        </test>
    </tests>
    <help><![CDATA[

.. class:: infomark

This tool performs scoring of docked ligand poses using deep learning.
It uses the gnina and libmolgrid toolkits to perform the scoring to generate
a prediction for how good the pose is.


-----

.. class:: infomark

**Inputs**

1. The protein receptor to dock into as a file in PDB format. This should have the ligand removed but can
   retain the waters. This is specified by the 'receptor' parameter.
2. A set of ligand poses to score in SDF format. This is specified by the 'ligands' parameter.

Other parameters:

'distance': Distance in Angstroms. Waters closer than this to any heavy atom in each ligand are removed.
  A discrete set of PDB files are created missing certain waters and the ligands corresponding to that PDB file are
  scored against it. If a distance of zero is specified then this process is skipped and all ligands are scored against
  the receptor 'as is'. The assumption is that all waters have already been removed in this case. Specifying a
  distance of 0 provides a significant boost in performance as the water checking step is avoided..

'model': A number of models are provided:
- weights.caffemodel - No threshold for distinction of actives and inactives (original model)
- 10nm.0_iter_50000.caffemodel: actives are molecules from DUDE that have better than 10nM activity
- 50nm.0_iter_50000.caffemodel: actives are molecules from DUDE that have better than 50nM activity
- 200nm.0_iter_50000.caffemodel: actives are molecules from DUDE that have better than 200nM activity

-----

.. class:: infomark

**Outputs**

An SDF file is produced as output. The predicted scores are contained within the SDF file
as the TransFSScore property and the PDB file (with the waters that clash with the ligand removed)
that was used for the scoring as the TransFSReceptor property.
Values for the score range from 0 (poor binding) to 1 (good binding).

The raw predictions (predictions.txt) is also provided as an output.

A set of PDB files is also output, each one with different crystallographic waters removed. Each ligand is
examined against input PDB structure and the with waters that clash (any heavy atom of the ligand closer than
the 'distance' parameter being removed. The file names are encoded with the water numbers that are removed.

    ]]></help>
</tool>
author	bgruening
date	Mon, 13 Apr 2020 10:54:08 -0400
parents	8d9c8ba2ec86
children