<tool id="ctb_rdkit_describtors" name="Descriptors" version="0.1">
    <description>calculated with RDKit</description>
    <parallelism method="multi" split_inputs="infile" split_mode="to_size" split_size="10000" shared_inputs="" merge_outputs="outfile"></parallelism>
    <requirements>
        <requirement type="package" version="2012_12_1">rdkit</requirement>
    </requirements>
    <command interpreter="python">rdkit_descriptors.py -i "${infile}" --iformat "${infile.ext}" -o "${outfile}" $header 2>&amp;1</command>
    <inputs>
        <param format="smi,sdf" name="infile" type="data" label="Molecule data in SD- or SMILES-format" help="Dataset missing? See TIP below"/>
        <param name="header" type="boolean" label="Include the descriptor name as header" truevalue="--header" falsevalue="" checked="false" />
    </inputs>
    <outputs>
        <data format="tabular" name="outfile" />
    </outputs>
    <tests>
    </tests>
    <help>

.. class:: infomark

**What this tool does**

| RDKit is an open-source toolkit for cheminformatics and machine learning.
| This tool focuses on descriptor calculation, although RDKit offers many other functions.
|
| The table below shows a brief overview of the descriptors.
|

+-----------------------------------+------------+
| Descriptor/Descriptor Family      | Language   |
+===================================+============+
| Gasteiger/Marsili Partial Charges | C++        |
+-----------------------------------+------------+
| BalabanJ                          | Python     |
+-----------------------------------+------------+
| BertzCT                           | Python     |
+-----------------------------------+------------+
| Ipc                               | Python     |
+-----------------------------------+------------+
| HallKierAlpha                     | Python     |
+-----------------------------------+------------+
| Kappa1 - Kappa3                   | Python     |
+-----------------------------------+------------+
| Chi0, Chi1                        | Python     |
+-----------------------------------+------------+
| Chi0n - Chi4n                     | Python     |
+-----------------------------------+------------+
| Chi0v - Chi4v                     | Python     |
+-----------------------------------+------------+
| MolLogP                           | C++        |
+-----------------------------------+------------+
| MolMR                             | C++        |
+-----------------------------------+------------+
| MolWt                             | C++        |
+-----------------------------------+------------+
| HeavyAtomCount                    | Python     |
+-----------------------------------+------------+
| HeavyAtomMolWt                    | Python     |
+-----------------------------------+------------+
| NHOHCount                         | C++        |
+-----------------------------------+------------+
| NOCount                           | C++        |
+-----------------------------------+------------+
| NumHAcceptors                     | C++        |
+-----------------------------------+------------+
| NumHDonors                        | C++        |
+-----------------------------------+------------+
| NumHeteroatoms                    | C++        |
+-----------------------------------+------------+
| NumRotatableBonds                 | C++        |
+-----------------------------------+------------+
| NumValenceElectrons               | Python     |
+-----------------------------------+------------+
| RingCount                         | C++        |
+-----------------------------------+------------+
| TPSA                              | C++        |
+-----------------------------------+------------+
| LabuteASA                         | C++        |
+-----------------------------------+------------+
| PEOE_VSA1 - PEOE_VSA14            | Python/C++ |
+-----------------------------------+------------+
| SMR_VSA1 - SMR_VSA10              | Python/C++ |
+-----------------------------------+------------+
| SlogP_VSA1 - SlogP_VSA12          | Python/C++ |
+-----------------------------------+------------+
| EState_VSA1 - EState_VSA11        | Python     |
+-----------------------------------+------------+
| VSA_EState1 - VSA_EState10        | Python     |
+-----------------------------------+------------+
| Topliss fragments                 | Python     |
+-----------------------------------+------------+

|
| A full list of the descriptors can be obtained here_.

.. _here: https://code.google.com/p/rdkit/wiki/DescriptorsInTheRDKit

-----

.. class:: warningmark

**HINT**

Use the **cut columns from a table** tool to select just the desired descriptors.
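
Outside Galaxy, the same column selection can be sketched in a few lines of Python. This is only a sketch under assumptions: it expects the output to have been written with the header option enabled, so that the first row holds the descriptor names. The function name ``select_descriptors``, the header spellings and the sample values are hypothetical placeholders and are not part of this tool.

```python
import csv
import io


def select_descriptors(tabular_text, wanted):
    """Return each data row reduced to the columns named in `wanted`.

    Assumes tab-separated text whose first row holds the descriptor names.
    """
    reader = csv.DictReader(io.StringIO(tabular_text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in wanted if name not in header]
    if missing:
        # Fail loudly so a misspelled descriptor name is not silently skipped.
        raise KeyError("columns not found: " + ", ".join(missing))
    return [[row[name] for name in wanted] for row in reader]


# Hypothetical two-molecule table; the numbers are placeholders, not real values.
example = "MolWt\tMolLogP\tTPSA\n1.0\t2.0\t3.0\n4.0\t5.0\t6.0\n"
for values in select_descriptors(example, ["MolWt", "TPSA"]):
    print("\t".join(values))
```

Selecting by name rather than by column position keeps the selection stable if the column order of the output ever changes.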

-----

.. class:: infomark

**Input**

| - `SD-Format`_
| - `SMILES Format`_
| - TDT_
| - SLN
| - `Corina MOL2`_

.. _SD-Format: http://en.wikipedia.org/wiki/Chemical_table_file
.. _SMILES Format: http://en.wikipedia.org/wiki/Simplified_molecular_input_line_entry_specification
.. _TDT: https://earray.chem.agilent.com/earray/helppages/index.htm#tdt_format_files.htm
.. _Corina MOL2: http://www.molecular-networks.com/products/corina

-----

.. class:: infomark

**Output**

Tabular file in which each descriptor value is shown in a separate column.

-----

.. class:: infomark

**Cite**

Greg Landrum - RDKit_: Open-source cheminformatics

.. _RDKit: http://www.rdkit.org


    </help>
</tool>