view rdkit_descriptors.xml @ 0:6bb56f3a8de0 draft default tip

Uploaded
author bgruening
date Thu, 15 Aug 2013 03:35:10 -0400
parents
children
line wrap: on
line source

<tool id="ctb_rdkit_describtors" name="Descriptors" version="0.1">
    <description>calculated with RDKit</description>
    <parallelism method="multi" split_inputs="infile" split_mode="to_size" split_size="10000" shared_inputs="" merge_outputs="outfile"></parallelism>
    <requirements>
        <requirement type="package" version="2012_12_1">rdkit</requirement>
    </requirements>
    <command interpreter="python">rdkit_descriptors.py -i "${infile}" --iformat "${infile.ext}" -o "${outfile}" $header 2>&#38;1</command>
    <inputs>
        <param format="smi,sdf" name="infile" type="data" label="Molecule data in SD- or SMILES-format" help="Dataset missing? See TIP below"/>
        <param name="header" type="boolean" label="Include the descriptor name as header" truevalue="--header" falsevalue="" checked="false" />
    </inputs>
    <outputs>
        <data format="tabular" name="outfile" />
    </outputs>
    <tests>
    </tests>
    <help>

.. class:: infomark

**What this tool does** 

| RDKit is an open source toolkit for cheminformatics and machine learning.
| This implementation focuses on descriptor calculation, though, RDKit offers a vast number of other functions.
| 
| The table below shows a brief overview of the descriptors.
| 

+-----------------------------------+------------+
|    Descriptor/Descriptor Family   |  Language  | 
+===================================+============+ 
| Gasteiger/Marsili Partial Charges |     C++    |
+-----------------------------------+------------+
|            BalabanJ               |   Python   |
+-----------------------------------+------------+
|             BertzCT               |   Python   |
+-----------------------------------+------------+
|               Ipc                 |   Python   |
+-----------------------------------+------------+
|          HallKierAlpha            |   Python   |
+-----------------------------------+------------+
|         Kappa1 - Kappa3           |   Python   |
+-----------------------------------+------------+
|            Chi0, Chi1             |   Python   |
+-----------------------------------+------------+
|           Chi0n - Chi4n           |   Python   |
+-----------------------------------+------------+
|           Chi0v - Chi4v           |   Python   |
+-----------------------------------+------------+
|              MolLogP              |     C++    |
+-----------------------------------+------------+
|               MolMR               |     C++    |
+-----------------------------------+------------+
|               MolWt               |     C++    |
+-----------------------------------+------------+
|           HeavyAtomCount          |   Python   |
+-----------------------------------+------------+
|           HeavyAtomMolWt          |   Python   |
+-----------------------------------+------------+
|             NHOHCount             |     C++    |
+-----------------------------------+------------+
|              NOCount              |     C++    |
+-----------------------------------+------------+
|            NumHAcceptors          |     C++    |
+-----------------------------------+------------+
|             NumHDonors            |     C++    |
+-----------------------------------+------------+
|            NumHeteroatoms         |     C++    |
+-----------------------------------+------------+
|          NumRotatableBonds        |     C++    |
+-----------------------------------+------------+
|         NumValenceElectrons       |   Python   |
+-----------------------------------+------------+
|              RingCount            |     C++    |
+-----------------------------------+------------+
|                 TPSA              |     C++    |
+-----------------------------------+------------+
|              LabuteASA            |     C++    |
+-----------------------------------+------------+
|       PEOE_VSA1 - PEOE_VSA14      | Python/C++ |
+-----------------------------------+------------+
|         SMR_VSA1 - SMR_VSA10      | Python/C++ |
+-----------------------------------+------------+
|      SlogP_VSA1 - SlogP_VSA12     | Python/C++ |
+-----------------------------------+------------+
|     EState_VSA1 - EState_VSA11    |   Python   |
+-----------------------------------+------------+
|     VSA_EState1 - VSA_EState10    |   Python   |
+-----------------------------------+------------+
|           Topliss fragments       |   Python   |
+-----------------------------------+------------+

| 
| A full list of the descriptors can be obtained here_.

.. _here: https://code.google.com/p/rdkit/wiki/DescriptorsInTheRDKit

-----

.. class:: warningmark

**HINT**

Use the **cut columns from a table** tool to select just the desired descriptors.

-----

.. class:: infomark

**Input**

| - `SD-Format`_
| - `SMILES Format`_
| - TDT_
| - SLN
| - `Corina MOL2`_

.. _SD-Format: http://en.wikipedia.org/wiki/Chemical_table_file
.. _SMILES Format: http://en.wikipedia.org/wiki/Simplified_molecular_input_line_entry_specification
.. _TDT: https://earray.chem.agilent.com/earray/helppages/index.htm#tdt_format_files.htm
.. _Corina MOL2: http://www.molecular-networks.com/products/corina

-----

.. class:: infomark

 **Output**

Tabularfile, where each descriptor (value) is shown in a seperate column.

-----

.. class:: informark

**Cite**

Greg Landrum - RDKit_: Open-source cheminformatics

.. _RDKit: http://www.rdkit.org


    </help>
</tool>