annotate COBRAxy/docs/tools/metabolic-model-setting.md @ 509:5956dcf94277 draft default tip

Uploaded
author francesco_lapi
date Wed, 01 Oct 2025 15:34:21 +0000
parents 4ed95023af20
children
# Metabolic Model Setting

Extract and organize metabolic model components into tabular format for analysis and integration.

## Overview

Metabolic Model Setting (metabolicModel2Tabular) extracts key components from SBML metabolic models and generates comprehensive tabular summaries. This tool processes built-in or custom models, applies medium constraints, handles gene nomenclature conversion, and outputs structured data for downstream analysis.

## Usage

### Command Line

```bash
metabolicModel2Tabular --model ENGRO2 \
  --name ENGRO2 \
  --medium_selector allOpen \
  --gene_format Default \
  --out_tabular model_data.csv \
  --out_log extraction.log \
  --tool_dir /path/to/COBRAxy
```

### Galaxy Interface

Select "Metabolic Model Setting" from the COBRAxy tool suite and configure model extraction parameters.

## Parameters

### Required Parameters

| Parameter | Flag | Description |
|-----------|------|-------------|
| Model Name | `--name` | Model identifier for output files |
| Medium Selector | `--medium_selector` | Medium configuration option |
| Output Tabular | `--out_tabular` | Output file path (CSV or XLSX) |
| Output Log | `--out_log` | Log file for processing information |
| Tool Directory | `--tool_dir` | COBRAxy installation directory |

### Model Selection Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Built-in Model | `--model` | Pre-installed model (ENGRO2, Recon, HMRcore) | - |
| Custom Model | `--input` | Path to a custom model file (SBML, JSON, MAT, or YML) | - |

**Note**: Provide either `--model` OR `--input`, not both.

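
The either/or rule for `--model` and `--input` can be enforced before the tool is ever launched. The sketch below is a hypothetical wrapper, not part of COBRAxy: `build_command` and its argument names are assumptions made for illustration, and only the flags documented in the tables above are used.

```python
from typing import List, Optional


def build_command(
    name: str,
    out_tabular: str,
    out_log: str,
    tool_dir: str,
    model: Optional[str] = None,
    input_path: Optional[str] = None,
    medium_selector: str = "allOpen",
    gene_format: str = "Default",
) -> List[str]:
    """Assemble a metabolicModel2Tabular argument list (hypothetical helper)."""
    # Enforce the documented rule: exactly one of --model / --input.
    if (model is None) == (input_path is None):
        raise ValueError("provide exactly one of model or input_path")
    source = ["--model", model] if model is not None else ["--input", input_path]
    return [
        "metabolicModel2Tabular",
        *source,
        "--name", name,
        "--medium_selector", medium_selector,
        "--gene_format", gene_format,
        "--out_tabular", out_tabular,
        "--out_log", out_log,
        "--tool_dir", tool_dir,
    ]


cmd = build_command("ENGRO2", "model_data.csv", "extraction.log",
                    "/path/to/COBRAxy", model="ENGRO2")
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with paths that contain spaces.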

### Optional Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Gene Format | `--gene_format` | Gene ID format conversion | Default |

## Model Selection

### Built-in Models

#### ENGRO2
- **Species**: Homo sapiens
- **Scope**: Genome-scale reconstruction
- **Reactions**: ~2,000 reactions
- **Metabolites**: ~1,500 metabolites
- **Coverage**: Comprehensive human metabolism

#### Recon
- **Species**: Homo sapiens
- **Scope**: Recon3D human reconstruction
- **Reactions**: ~10,000+ reactions
- **Metabolites**: ~5,000+ metabolites
- **Coverage**: Most comprehensive human model

#### HMRcore
- **Species**: Homo sapiens
- **Scope**: Core metabolic network
- **Reactions**: ~300 essential reactions
- **Metabolites**: ~200 core metabolites
- **Coverage**: Central carbon and energy metabolism

### Custom Models

Supported formats for custom model import:
- **SBML**: Systems Biology Markup Language (.xml, .sbml)
- **JSON**: COBRApy JSON format (.json)
- **MAT**: MATLAB format (.mat)
- **YML**: YAML format (.yml, .yaml)
- **Compressed**: All formats support .gz, .zip, .bz2 compression

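
As a rough illustration of how the extensions listed above could be checked before a custom model is passed to `--input`, the following sketch infers format and compression from the file name alone. `detect_model_format` and the label strings are hypothetical; how the tool itself detects formats is not documented here.

```python
from pathlib import Path
from typing import Optional, Tuple

# Extensions from the list above, mapped to illustrative format labels.
FORMATS = {".xml": "SBML", ".sbml": "SBML", ".json": "JSON",
           ".mat": "MAT", ".yml": "YML", ".yaml": "YML"}
COMPRESSION = {".gz", ".zip", ".bz2"}


def detect_model_format(path: str) -> Tuple[str, Optional[str]]:
    """Return (format, compression) inferred from the file name only."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    compression = None
    # A trailing compression suffix wraps the real model extension.
    if suffixes and suffixes[-1] in COMPRESSION:
        compression = suffixes.pop().lstrip(".")
    if not suffixes or suffixes[-1] not in FORMATS:
        raise ValueError(f"Unsupported format: {path}")
    return FORMATS[suffixes[-1]], compression


print(detect_model_format("/data/custom_model.xml"))
print(detect_model_format("recon.json.gz"))
```

A name-based check like this only catches obvious mistakes early; it says nothing about whether the file content is valid.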

## Medium Configuration

### allOpen (Default)
- All exchange reactions unconstrained
- Maximum metabolic flexibility
- Suitable for general analysis

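
To make "unconstrained" concrete, the sketch below resets the bounds of a set of exchange reactions to a wide symmetric range. It is only an illustration: the ±1000 magnitude is an assumption taken from the typical bound values in the output table, and `open_all_exchanges` is a hypothetical function, not the tool's implementation.

```python
from typing import Dict, Iterable, Tuple

Bounds = Dict[str, Tuple[float, float]]

DEFAULT_BOUND = 1000.0  # assumed magnitude for an "open" reaction


def open_all_exchanges(bounds: Bounds, exchange_ids: Iterable[str]) -> Bounds:
    """Return a copy of bounds with every listed exchange reaction fully opened."""
    opened = dict(bounds)
    for rid in exchange_ids:
        opened[rid] = (-DEFAULT_BOUND, DEFAULT_BOUND)
    return opened


bounds = {"EX_glc_e": (-10.0, 1000.0), "R00001": (0.0, 1000.0)}
opened = open_all_exchanges(bounds, ["EX_glc_e"])
print(opened)
```

Internal reactions keep their original bounds; only the exchange reactions named in `exchange_ids` are widened.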

### Custom Medium
Users can specify custom medium constraints through the Galaxy interface or by modifying the tool configuration.

## Gene Format Options

| Format | Description | Example |
|--------|-------------|---------|
| Default | Original model gene IDs | As stored in model |
| ENSNG | Ensembl Gene IDs | ENSG00000139618 |
| HGNC_SYMBOL | HUGO Gene Symbols | BRCA2 |
| HGNC_ID | HGNC (HUGO Gene Nomenclature Committee) IDs | HGNC:1101 |
| ENTREZ | NCBI Entrez Gene IDs | 675 |

Gene format conversion uses internal mapping tables and may not cover all genes in custom models.

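
The effect of incomplete mapping tables can be pictured with a small sketch. It rewrites the gene tokens of a GPR rule through a lookup table and leaves any unmapped ID unchanged, which matches the fallback described under Troubleshooting. The one-entry mapping and the `convert_gpr` function are illustrative assumptions; the tool's internal tables and code are not reproduced here.

```python
import re
from typing import Dict

# Toy mapping for illustration only (symbol -> Ensembl gene ID).
SYMBOL_TO_ENSEMBL: Dict[str, str] = {"BRCA2": "ENSG00000139618"}

OPERATORS = {"and", "or"}


def convert_gpr(rule: str, mapping: Dict[str, str]) -> str:
    """Replace each gene token in a GPR rule, keeping unmapped IDs as they are."""
    def swap(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token.lower() in OPERATORS:
            return token  # logical operators are not gene IDs
        return mapping.get(token, token)

    # A token is any run of characters that is not whitespace or a parenthesis.
    return re.sub(r"[^\s()]+", swap, rule)


converted = convert_gpr("(BRCA2 and GENE_X) or BRCA2", SYMBOL_TO_ENSEMBL)
print(converted)
```

Here `GENE_X` has no entry in the mapping, so it survives unchanged while both occurrences of `BRCA2` are replaced.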

## Output Format

### Tabular Summary File

The output contains comprehensive model information in CSV or XLSX format:

#### Column Structure
```
Reaction_ID  GPR_Rule         Reaction_Formula  Lower_Bound  Upper_Bound  Objective_Coefficient  Medium_Member  Compartment    Subsystem
R00001       GENE1 or GENE2   A + B -> C + D    -1000.0      1000.0       0.0                    FALSE          cytosol        Glycolysis
R00002       GENE3 and GENE4  E <-> F           -1000.0      1000.0       0.0                    FALSE          mitochondria   TCA_Cycle
EX_glc_e     -                glc_e <->         -1000.0      1000.0       0.0                    TRUE           extracellular  Exchange
```

#### Data Fields

| Field | Description | Values |
|-------|-------------|---------|
| Reaction_ID | Unique reaction identifier | String |
| GPR_Rule | Gene-protein-reaction association | Logical expression |
| Reaction_Formula | Stoichiometric equation | Metabolites with coefficients |
| Lower_Bound | Minimum flux constraint | Numeric (typically -1000) |
| Upper_Bound | Maximum flux constraint | Numeric (typically 1000) |
| Objective_Coefficient | Biomass/objective weight | Numeric (0 or 1) |
| Medium_Member | Exchange reaction flag | TRUE/FALSE |
| Compartment | Subcellular location | String (for ENGRO2 only) |
| Subsystem | Metabolic pathway | String |

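
A minimal sketch of reading this table with the Python standard library follows. It assumes a tab-delimited file, as the `awk -F'\t'` commands elsewhere on this page do; pass `delimiter=","` if your output is comma-separated. `load_model_table` is a hypothetical helper and the sample rows are made up to mirror the column structure above.

```python
import csv
import io
from typing import Dict, List

NUMERIC = ("Lower_Bound", "Upper_Bound", "Objective_Coefficient")


def load_model_table(handle, delimiter: str = "\t") -> List[Dict[str, object]]:
    """Read the tabular summary and coerce numeric and boolean columns."""
    rows: List[Dict[str, object]] = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        for field in NUMERIC:
            row[field] = float(row[field])
        row["Medium_Member"] = row["Medium_Member"].strip().upper() == "TRUE"
        rows.append(row)
    return rows


# In-memory stand-in for a real output file.
sample = io.StringIO(
    "Reaction_ID\tGPR_Rule\tReaction_Formula\tLower_Bound\tUpper_Bound\t"
    "Objective_Coefficient\tMedium_Member\tCompartment\tSubsystem\n"
    "R00001\tGENE1 or GENE2\tA + B -> C + D\t-1000.0\t1000.0\t0.0\tFALSE\tcytosol\tGlycolysis\n"
    "EX_glc_e\t\tglc_e <->\t-1000.0\t1000.0\t0.0\tTRUE\textracellular\tExchange\n"
)
rows = load_model_table(sample)
print(rows[1]["Reaction_ID"], rows[1]["Medium_Member"])
```

For a file on disk, open it with `open("model_data.csv", newline="")` and pass the handle to the same function.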

## Examples

### Extract Built-in Model Data

```bash
# Extract ENGRO2 model with default settings
metabolicModel2Tabular --model ENGRO2 \
  --name ENGRO2_extraction \
  --medium_selector allOpen \
  --gene_format Default \
  --out_tabular ENGRO2_data.csv \
  --out_log ENGRO2_log.txt \
  --tool_dir /opt/COBRAxy
```

### Process Custom Model

```bash
# Extract custom SBML model with gene conversion
metabolicModel2Tabular --input /data/custom_model.xml \
  --name CustomModel \
  --medium_selector allOpen \
  --gene_format HGNC_SYMBOL \
  --out_tabular custom_model_data.xlsx \
  --out_log custom_extraction.log \
  --tool_dir /opt/COBRAxy
```

### Extract Core Model for Quick Analysis

```bash
# Extract HMRcore for rapid prototyping
metabolicModel2Tabular --model HMRcore \
  --name CoreModel \
  --medium_selector allOpen \
  --gene_format ENSNG \
  --out_tabular core_reactions.csv \
  --out_log core_log.txt \
  --tool_dir /opt/COBRAxy
```

### Batch Processing Multiple Models

```bash
#!/bin/bash
models=("ENGRO2" "HMRcore" "Recon")
for model in "${models[@]}"; do
  metabolicModel2Tabular --model "$model" \
    --name "${model}_extract" \
    --medium_selector allOpen \
    --gene_format HGNC_SYMBOL \
    --out_tabular "${model}_data.csv" \
    --out_log "${model}_log.txt" \
    --tool_dir /opt/COBRAxy
done
```

## Use Cases

### Model Comparison
Extract multiple models to compare:
- Reaction coverage across different reconstructions
- Gene-reaction associations
- Pathway representation
- Metabolite compartmentalization

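
One way to compare reaction coverage between two extracted tables is a set comparison on the `Reaction_ID` column. This sketch assumes both models use the same reaction ID namespace, which is often not true across reconstructions, so treat the result as a first pass. `compare_coverage` is a hypothetical helper and the rows below are toy data.

```python
from typing import Dict, Iterable, Set


def reaction_ids(rows: Iterable[Dict[str, str]]) -> Set[str]:
    """Collect the reaction identifiers of one extracted table."""
    return {row["Reaction_ID"] for row in rows}


def compare_coverage(first, second) -> Dict[str, object]:
    """Summarize shared and model-specific reaction IDs plus Jaccard similarity."""
    ids_a, ids_b = reaction_ids(first), reaction_ids(second)
    union = ids_a | ids_b
    return {
        "shared": sorted(ids_a & ids_b),
        "only_first": sorted(ids_a - ids_b),
        "only_second": sorted(ids_b - ids_a),
        "jaccard": len(ids_a & ids_b) / len(union) if union else 1.0,
    }


model_a = [{"Reaction_ID": "R1"}, {"Reaction_ID": "R2"}, {"Reaction_ID": "EX_glc_e"}]
model_b = [{"Reaction_ID": "R2"}, {"Reaction_ID": "R3"}]
summary = compare_coverage(model_a, model_b)
print(summary)
```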

### Data Integration
Prepare model data for:
- Custom analysis pipelines
- Database integration
- Pathway annotation
- Cross-reference mapping

### Quality Control
Validate model properties:
- Check reaction balancing
- Verify gene associations
- Assess network connectivity
- Identify missing annotations

### Custom Analysis
Export structured data for:
- Network analysis (graph theory)
- Machine learning applications
- Statistical modeling
- Comparative genomics

## Integration Workflow

### Downstream Tools

The extracted tabular data serves as input for:

#### COBRAxy Tools
- [RAS Generator](ras-generator.md) - Use extracted GPR rules
- [RPS Generator](rps-generator.md) - Use reaction formulas
- [RAS to Bounds](ras-to-bounds.md) - Use reaction bounds
- [MAREA](marea.md) - Use reaction annotations

#### External Analysis
- **R/Bioconductor**: Import CSV for pathway analysis
- **Python/pandas**: Load data for network analysis
- **MATLAB**: Process XLSX for modeling
- **Cytoscape**: Network visualization
- **Databases**: Populate reaction databases

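
For network analysis or visualization, the `Reaction_Formula` column can be turned into an edge list. The sketch below assumes the formula syntax of the sample table (`->` and `<->` arrows, terms joined by ` + `, optional leading numeric coefficients); the real output may differ, so adjust the parsing to what your files contain. `formula_to_edges` is a hypothetical helper.

```python
import re
from typing import List, Tuple

ARROW = re.compile(r"\s*(<->|->)\s*")  # arrow styles seen in the sample table


def formula_to_edges(reaction_id: str, formula: str) -> List[Tuple[str, str]]:
    """Turn 'A + B -> C' into substrate->reaction and reaction->product edges."""
    parts = ARROW.split(formula, maxsplit=1)
    if len(parts) != 3:
        raise ValueError(f"no reaction arrow in: {formula!r}")
    left, _arrow, right = parts

    def metabolites(side: str) -> List[str]:
        names = []
        for term in side.split(" + "):
            term = term.strip()
            if term:
                # Drop a leading stoichiometric coefficient such as "2 atp".
                names.append(re.sub(r"^\d+(\.\d+)?\s+", "", term))
        return names

    edges = [(m, reaction_id) for m in metabolites(left)]
    edges += [(reaction_id, m) for m in metabolites(right)]
    return edges


edges = formula_to_edges("R00001", "A + B -> C + D")
print(edges)
```

Edges follow the direction in which the formula is written; for reversible (`<->`) reactions you may want to add the reverse edges as well. The resulting pairs can be written out as a two-column file for import into a graph tool.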

### Typical Pipeline

```bash
# 1. Extract model components
metabolicModel2Tabular --model ENGRO2 --name ModelData \
  --medium_selector allOpen \
  --out_tabular model_components.csv \
  --out_log model_extraction.log \
  --tool_dir /opt/COBRAxy

# 2. Use extracted data for RAS analysis
ras_generator -td /opt/COBRAxy -rs Custom \
  -rl model_components.csv \
  -in expression_data.tsv -ra ras_scores.tsv

# 3. Apply RAS-derived constraints to the reaction bounds
ras_to_bounds -td /opt/COBRAxy -ms Custom -mo model_components.csv \
  -ir ras_scores.tsv -idop constrained_bounds/

# 4. Visualize results
marea -td /opt/COBRAxy -input_data ras_scores.tsv \
  -choice_map Custom -custom_map custom.svg -idop results/
```

## Quality Control

### Pre-extraction Validation
- Verify model file integrity and format
- Check SBML compliance for custom models
- Validate gene ID formats and coverage
- Confirm medium constraint specifications

### Post-extraction Checks
- **Completeness**: Verify all expected reactions extracted
- **Completeness**: Verify all expected reactions extracted
- **Consistency**: Check stoichiometric balance
- **Annotations**: Validate gene-reaction associations
- **Formatting**: Confirm output file structure

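
Some of these post-extraction checks can be automated on the parsed rows. The sketch below covers structure only: expected columns, duplicate IDs, bound ordering, and the presence of a reaction arrow. It does not test stoichiometric balance, which would need metabolite formulas that the table does not carry. `check_table` is a hypothetical helper, and `Compartment` is left out of the required columns because the field table marks it as ENGRO2-only.

```python
from typing import Dict, List

EXPECTED_COLUMNS = [
    "Reaction_ID", "GPR_Rule", "Reaction_Formula", "Lower_Bound",
    "Upper_Bound", "Objective_Coefficient", "Medium_Member", "Subsystem",
]


def check_table(rows: List[Dict[str, str]]) -> List[str]:
    """Return human-readable problems found in the extracted rows."""
    if not rows:
        return ["table is empty"]
    missing = [c for c in EXPECTED_COLUMNS if c not in rows[0]]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    problems: List[str] = []
    seen = set()
    for row in rows:
        rid = row["Reaction_ID"]
        if rid in seen:
            problems.append(f"{rid}: duplicate reaction ID")
        seen.add(rid)
        if float(row["Lower_Bound"]) > float(row["Upper_Bound"]):
            problems.append(f"{rid}: lower bound exceeds upper bound")
        if "->" not in row["Reaction_Formula"]:
            problems.append(f"{rid}: formula has no reaction arrow")
    return problems


base = {
    "Reaction_ID": "R1", "GPR_Rule": "G1", "Reaction_Formula": "A -> B",
    "Lower_Bound": "-1000.0", "Upper_Bound": "1000.0",
    "Objective_Coefficient": "0.0", "Medium_Member": "FALSE",
    "Subsystem": "Glycolysis",
}
print(check_table([base]))
```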

### Data Validation

#### Reaction Balancing
```bash
# Flag reactions whose formula lacks a reaction arrow (-> or <->).
# This catches malformed formulas; it does not test mass balance.
awk -F'\t' 'NR>1 && $3 !~ /->/ {print $1, $3}' model_data.csv
```

#### Gene Coverage
```bash
# Count reactions with GPR rules
awk -F'\t' 'NR>1 && $2 != "" {count++} END {print count " reactions with GPR"}' model_data.csv
```

#### Exchange Reactions
```bash
# List medium components
awk -F'\t' 'NR>1 && $7 == "TRUE" {print $1}' model_data.csv
```

## Tips and Best Practices

### Model Selection
- **ENGRO2**: Balanced coverage for human tissue analysis
- **HMRcore**: Fast processing for algorithm development
- **Recon**: Comprehensive analysis requiring computational resources
- **Custom**: Organism-specific or specialized models

### Gene Format Selection
- **Default**: Preserve original model annotations
- **HGNC_SYMBOL**: Human-readable gene names
- **ENSNG**: Stable identifiers for bioinformatics
- **ENTREZ**: Cross-database compatibility

### Output Format Optimization
- **CSV**: Lightweight, universal compatibility
- **XLSX**: Rich formatting, multiple sheets possible
- Choose based on downstream analysis requirements

### Performance Considerations
- Large models (Recon) may require substantial memory
- Gene format conversion adds processing time
- Consider batch processing for multiple extractions

4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
323 ## Troubleshooting
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
324
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
325 ### Common Issues
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
326
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
327 **Model loading fails**
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
328 - Check file format and compression
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
329 - Verify SBML validity for custom models
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
330 - Ensure sufficient system memory
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
331
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
332 **Gene format conversion errors**
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
333 - Mapping tables may not cover all genes
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
334 - Original gene IDs retained when conversion fails
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
335 - Check log file for conversion statistics
4ed95023af20 Uploaded
francesco_lapi
parents:
diff changeset
336
**Empty output file**
- Model may contain no reactions
- Check model file integrity
- Verify tool directory configuration

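A quick way to confirm whether an output file is really empty is to count its data rows. The sketch below is an illustration under assumptions: `count_data_rows` is a hypothetical helper, the comma delimiter is assumed (pass `delimiter="\t"` for tab-separated output), and the sample file and its column names exist only so the example runs on its own.

```python
import csv
import os
import tempfile


def count_data_rows(path, delimiter=","):
    """Return (header, number_of_non_empty_data_rows) for a delimited file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)  # None when the file has no lines at all
        n_rows = sum(1 for row in reader if row)
    return header, n_rows


# Self-contained demonstration on a tiny, made-up file
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.csv")
    with open(sample, "w", newline="") as handle:
        handle.write("reaction_id,gpr_rule\nR1,g1 and g2\n")
    header, n_rows = count_data_rows(sample)

print(header, n_rows)
```

A header with zero data rows points to a model without reactions; a missing header (`None`) suggests the tool did not write anything at all.
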
### Error Messages

| Error | Cause | Solution |
|-------|-------|----------|
| "Model file not found" | Invalid file path | Check file location and permissions |
| "Unsupported format" | Invalid model format | Use SBML, JSON, MAT, or YML |
| "Gene mapping failed" | Missing gene conversion data | Use Default format or update mappings |
| "Memory allocation error" | Insufficient system memory | Use smaller model or increase memory |

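To catch an "Unsupported format" problem before running the tool, a custom model's file name can be checked up front. This is a rough sketch, not the tool's own validation: the extension sets below are assumptions derived from the formats named in the table (SBML, JSON, MAT, YML) plus common compression suffixes, and `model_format` is a hypothetical helper.

```python
from pathlib import Path

# Assumed extensions for the supported formats; adjust to your installation
SUPPORTED_EXTENSIONS = {".xml", ".sbml", ".json", ".mat", ".yml", ".yaml"}
# Assumed compression suffixes that may wrap a model file
COMPRESSION_EXTENSIONS = {".gz", ".zip", ".bz2"}


def model_format(path):
    """Return the model extension (e.g. '.xml') or None if it is not recognised."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Ignore a trailing compression suffix such as '.gz'
    if suffixes and suffixes[-1] in COMPRESSION_EXTENSIONS:
        suffixes = suffixes[:-1]
    if suffixes and suffixes[-1] in SUPPORTED_EXTENSIONS:
        return suffixes[-1]
    return None


for name in ["ENGRO2.xml", "custom.xml.gz", "model.txt"]:
    print(name, model_format(name))
```

The check only looks at the file name; a file with a valid extension can still fail SBML validation.
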
### Performance Issues

**Slow processing**
- Large models require more time
- Gene conversion adds overhead
- Monitor system resource usage

**Memory errors**
- Reduce model size if possible
- Process in smaller batches
- Increase available system memory

**Output file corruption**
- Check disk space availability
- Verify file write permissions
- Monitor for system interruptions

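The disk space and write permission checks can be scripted before a long run. The following is a minimal sketch using only the Python standard library; `check_output_location` and the 100 MB default threshold are illustrative assumptions, not part of the tool.

```python
import os
import shutil
import tempfile


def check_output_location(directory, min_free_bytes=100 * 1024 * 1024):
    """Return a list of problems found for the output directory (empty if none)."""
    problems = []
    if not os.path.isdir(directory):
        problems.append(f"{directory} is not a directory")
        return problems
    if not os.access(directory, os.W_OK):
        problems.append(f"no write permission for {directory}")
    free = shutil.disk_usage(directory).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free in {directory}")
    return problems


# Demonstration on a temporary directory; the threshold is set to 0 here
# so the result does not depend on the machine running the example
with tempfile.TemporaryDirectory() as tmp:
    ok_report = check_output_location(tmp, min_free_bytes=0)
    missing_report = check_output_location(os.path.join(tmp, "missing"))

print(ok_report)
print(missing_report)
```
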
## Advanced Usage

### Custom Gene Mapping

Advanced users can extend gene format conversion by modifying mapping files in the `local/mappings/` directory.

### Batch Extraction Script

```python
#!/usr/bin/env python3
import subprocess

models = ['ENGRO2', 'HMRcore', 'Recon']
formats = ['Default', 'HGNC_SYMBOL', 'ENSNG']

# Run one extraction per model / gene format combination
for model in models:
    for fmt in formats:
        cmd = [
            'metabolicModel2Tabular',
            '--model', model,
            '--name', f'{model}_{fmt}',
            '--medium_selector', 'allOpen',
            '--gene_format', fmt,
            '--out_tabular', f'{model}_{fmt}.csv',
            '--out_log', f'{model}_{fmt}.log',
            '--tool_dir', '/opt/COBRAxy'
        ]
        # check=True stops the batch at the first failing extraction
        subprocess.run(cmd, check=True)
```

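With `check=True`, the script above aborts on the first failing extraction. When the remaining combinations should still run, the failures can be collected instead. The sketch below shows that pattern under assumptions: `run_all` is a hypothetical helper, and the demo commands are stand-ins that call the Python interpreter so the example runs without COBRAxy installed.

```python
import subprocess
import sys


def run_all(commands):
    """Run each command and return a list of (command, return_code) for failures."""
    failures = []
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((cmd, result.returncode))
    return failures


# Stand-in commands: one succeeds, one exits with status 3
demo_commands = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(3)"],
]
failures = run_all(demo_commands)
for cmd, code in failures:
    print(f"failed with exit code {code}: {cmd[-1]}")
```

In a real batch, `commands` would be the list of `metabolicModel2Tabular` invocations built in the loop above.
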
### Database Integration

Export model data to databases:

```sql
-- Load CSV into PostgreSQL
CREATE TABLE model_reactions (
    reaction_id VARCHAR(50),
    gpr_rule TEXT,
    reaction_formula TEXT,
    lower_bound FLOAT,
    upper_bound FLOAT,
    objective_coefficient FLOAT,
    medium_member BOOLEAN,
    compartment VARCHAR(50),
    subsystem VARCHAR(100)
);

-- COPY reads the file on the database server; in psql, \copy reads it from the client
COPY model_reactions FROM 'model_data.csv' WITH CSV HEADER;
```

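When no PostgreSQL server is available, the same table can be loaded into SQLite from Python. This is a minimal sketch under assumptions: it presumes the CSV header uses exactly the column names of the SQL table above, `load_reactions` is a hypothetical helper, and the sample row is made up for the demonstration.

```python
import csv
import io
import sqlite3


def load_reactions(connection, csv_handle):
    """Create model_reactions (if needed) and insert all rows from a CSV handle."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS model_reactions ("
        "reaction_id TEXT, gpr_rule TEXT, reaction_formula TEXT, "
        "lower_bound REAL, upper_bound REAL, objective_coefficient REAL, "
        "medium_member INTEGER, compartment TEXT, subsystem TEXT)"
    )
    rows = []
    for row in csv.DictReader(csv_handle):
        rows.append((
            row["reaction_id"],
            row["gpr_rule"],
            row["reaction_formula"],
            float(row["lower_bound"]),
            float(row["upper_bound"]),
            float(row["objective_coefficient"]),
            # Store booleans as 0/1; accepts 'True'/'true'/'1' as true
            int(row["medium_member"].strip().lower() in ("true", "1")),
            row["compartment"],
            row["subsystem"],
        ))
    connection.executemany(
        "INSERT INTO model_reactions VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    connection.commit()
    return len(rows)


# Made-up sample data standing in for the tool's CSV output
sample = io.StringIO(
    "reaction_id,gpr_rule,reaction_formula,lower_bound,upper_bound,"
    "objective_coefficient,medium_member,compartment,subsystem\n"
    "R1,g1 and g2,A --> B,0.0,1000.0,0.0,False,c,Example\n"
)
conn = sqlite3.connect(":memory:")
n_loaded = load_reactions(conn, sample)
stored = conn.execute(
    "SELECT reaction_id, upper_bound, medium_member FROM model_reactions"
).fetchall()
print(n_loaded, stored)
```

For a real file, replace the `io.StringIO` sample with `open('model_data.csv', newline='')` and the in-memory database with a file path.
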
## See Also

- [RAS Generator](ras-generator.md) - Use extracted GPR rules for RAS computation
- [RPS Generator](rps-generator.md) - Use reaction formulas for RPS analysis
- [Custom Model Tutorial](../tutorials/custom-model-integration.md)
- [Gene Mapping Reference](../tutorials/gene-id-conversion.md)