# Metabolic Model Setting

Extract and organize metabolic model components into tabular format for analysis and integration.

## Overview

Metabolic Model Setting (metabolicModel2Tabular) extracts key components from SBML metabolic models and generates comprehensive tabular summaries. This tool processes built-in or custom models, applies medium constraints, handles gene nomenclature conversion, and outputs structured data for downstream analysis.

## Usage

### Command Line

```bash
metabolicModel2Tabular --model ENGRO2 \
                       --name ENGRO2 \
                       --medium_selector allOpen \
                       --gene_format Default \
                       --out_tabular model_data.csv \
                       --out_log extraction.log \
                       --tool_dir /path/to/COBRAxy
```

### Galaxy Interface

Select "Metabolic Model Setting" from the COBRAxy tool suite and configure model extraction parameters.

## Parameters

### Required Parameters

| Parameter | Flag | Description |
|-----------|------|-------------|
| Model Name | `--name` | Model identifier for output files |
| Medium Selector | `--medium_selector` | Medium configuration option |
| Output Tabular | `--out_tabular` | Output file path (CSV or XLSX) |
| Output Log | `--out_log` | Log file for processing information |
| Tool Directory | `--tool_dir` | COBRAxy installation directory |

### Model Selection Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Built-in Model | `--model` | Pre-installed model (ENGRO2, Recon, HMRcore) | - |
| Custom Model | `--input` | Path to a custom model file (SBML, JSON, MAT, or YML; see Custom Models) | - |

**Note**: Provide either `--model` OR `--input`, not both.
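
The rule in the note above can be enforced before the tool is launched. The following is a minimal sketch, not part of COBRAxy: `build_command` is a hypothetical helper that assembles the argument list from the flags documented in this section and rejects calls that pass both a built-in model and a custom file. It also rejects calls that pass neither, on the assumption that the tool needs one model source.

```python
from typing import List, Optional


def build_command(
    name: str,
    out_tabular: str,
    out_log: str,
    tool_dir: str,
    model: Optional[str] = None,
    input_path: Optional[str] = None,
    medium_selector: str = "allOpen",
    gene_format: str = "Default",
) -> List[str]:
    """Assemble a metabolicModel2Tabular argument list (hypothetical helper)."""
    # Exactly one model source must be given: --model XOR --input.
    if (model is None) == (input_path is None):
        raise ValueError("Provide exactly one of 'model' or 'input_path'")
    source = ["--model", model] if model is not None else ["--input", input_path]
    return [
        "metabolicModel2Tabular",
        *source,
        "--name", name,
        "--medium_selector", medium_selector,
        "--gene_format", gene_format,
        "--out_tabular", out_tabular,
        "--out_log", out_log,
        "--tool_dir", tool_dir,
    ]
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`.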

### Optional Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Gene Format | `--gene_format` | Gene ID format conversion | Default |

## Model Selection

### Built-in Models

#### ENGRO2
- **Species**: Homo sapiens
- **Scope**: Genome-scale reconstruction
- **Reactions**: ~2,000 reactions
- **Metabolites**: ~1,500 metabolites
- **Coverage**: Comprehensive human metabolism

#### Recon
- **Species**: Homo sapiens
- **Scope**: Recon3D human reconstruction
- **Reactions**: ~10,000+ reactions
- **Metabolites**: ~5,000+ metabolites
- **Coverage**: Most comprehensive human model

#### HMRcore
- **Species**: Homo sapiens
- **Scope**: Core metabolic network
- **Reactions**: ~300 essential reactions
- **Metabolites**: ~200 core metabolites
- **Coverage**: Central carbon and energy metabolism

### Custom Models

Supported formats for custom model import:
- **SBML**: Systems Biology Markup Language (.xml, .sbml)
- **JSON**: COBRApy JSON format (.json)
- **MAT**: MATLAB format (.mat)
- **YML**: YAML format (.yml, .yaml)
- **Compressed**: All formats support .gz, .zip, .bz2 compression
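
Before submitting a custom model, it can help to check that its file name matches one of the formats listed above. The sketch below is an illustration only: `detect_model_format` is a hypothetical helper, and it assumes the format is recognised from the file extension, which may differ from how the tool itself inspects files.

```python
from pathlib import Path
from typing import Optional, Tuple

# Extension tables taken from the list of supported formats above.
MODEL_FORMATS = {
    ".xml": "SBML", ".sbml": "SBML",
    ".json": "JSON",
    ".mat": "MAT",
    ".yml": "YML", ".yaml": "YML",
}
COMPRESSIONS = {".gz", ".zip", ".bz2"}


def detect_model_format(path: str) -> Tuple[str, Optional[str]]:
    """Return (format, compression suffix or None) for a model file name."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    compression = None
    # A trailing compression suffix wraps the real model extension.
    if suffixes and suffixes[-1] in COMPRESSIONS:
        compression = suffixes.pop()
    if not suffixes or suffixes[-1] not in MODEL_FORMATS:
        raise ValueError(f"Unsupported format: {path}")
    return MODEL_FORMATS[suffixes[-1]], compression
```

For example, `detect_model_format("/data/custom_model.xml")` yields `("SBML", None)`, while a name ending in `.json.gz` is reported as JSON with `.gz` compression.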

## Medium Configuration

### allOpen (Default)
- All exchange reactions unconstrained
- Maximum metabolic flexibility
- Suitable for general analysis

### Custom Medium
Users can specify custom medium constraints through the Galaxy interface or by modifying the tool configuration.

## Gene Format Options

| Format | Description | Example |
|--------|-------------|---------|
| Default | Original model gene IDs | As stored in model |
| ENSNG | Ensembl Gene IDs | ENSG00000139618 |
| HGNC_SYMBOL | HUGO Gene Symbols | BRCA2 |
| HGNC_ID | HGNC (HUGO Gene Nomenclature Committee) IDs | HGNC:1101 |
| ENTREZ | NCBI Entrez Gene IDs | 675 |

Gene format conversion uses internal mapping tables and may not cover all genes in custom models.
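
To illustrate what incomplete coverage means for a GPR rule, here is a minimal sketch of a conversion that keeps the original ID whenever a gene is missing from the mapping (the fallback behaviour described under Troubleshooting). The function `convert_gpr` and the one-entry mapping in the usage note are hypothetical; the tool's internal tables and logic may differ.

```python
import re
from typing import Dict, List, Tuple

# A token is any run of characters that is not whitespace or a parenthesis.
GPR_TOKEN = re.compile(r"[^\s()]+")
OPERATORS = {"and", "or"}


def convert_gpr(rule: str, mapping: Dict[str, str]) -> Tuple[str, List[str]]:
    """Rewrite gene IDs in a GPR rule; return (new rule, unmapped IDs)."""
    unmapped: List[str] = []

    def swap(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token.lower() in OPERATORS:
            return token  # logical operators are left untouched
        if token in mapping:
            return mapping[token]
        unmapped.append(token)  # keep the original ID and report it
        return token

    return GPR_TOKEN.sub(swap, rule), unmapped
```

With the mapping `{"HGNC:1101": "BRCA2"}`, a rule that mixes `HGNC:1101` with an ID absent from the mapping has the first replaced by `BRCA2`, while the second is left unchanged and returned in the unmapped list for inspection.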

## Output Format

### Tabular Summary File

The output contains comprehensive model information in CSV or XLSX format:

#### Column Structure
```
Reaction_ID	GPR_Rule	Reaction_Formula	Lower_Bound	Upper_Bound	Objective_Coefficient	Medium_Member	Compartment	Subsystem
R00001	GENE1 or GENE2	A + B -> C + D	-1000.0	1000.0	0.0	FALSE	cytosol	Glycolysis
R00002	GENE3 and GENE4	E <-> F	-1000.0	1000.0	0.0	FALSE	mitochondria	TCA_Cycle
EX_glc_e	-	glc_e <->	-1000.0	1000.0	0.0	TRUE	extracellular	Exchange
```

#### Data Fields

| Field | Description | Values |
|-------|-------------|---------|
| Reaction_ID | Unique reaction identifier | String |
| GPR_Rule | Gene-protein-reaction association | Logical expression |
| Reaction_Formula | Stoichiometric equation | Metabolites with coefficients |
| Lower_Bound | Minimum flux constraint | Numeric (typically -1000) |
| Upper_Bound | Maximum flux constraint | Numeric (typically 1000) |
| Objective_Coefficient | Biomass/objective weight | Numeric (0 or 1) |
| Medium_Member | Exchange reaction flag | TRUE/FALSE |
| Compartment | Subcellular location | String (for ENGRO2 only) |
| Subsystem | Metabolic pathway | String |
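
The fields above can be read into typed Python records with the standard library. This is a minimal sketch: `load_model_table` is a hypothetical helper, and the tab delimiter is an assumption based on the column-structure sample (pass `delimiter=","` if your file is comma-separated).

```python
import csv
import io
from typing import Dict, Iterable, List


def load_model_table(lines: Iterable[str], delimiter: str = "\t") -> List[Dict[str, object]]:
    """Parse the tabular summary and convert numeric and boolean fields."""
    rows: List[Dict[str, object]] = []
    for raw in csv.DictReader(lines, delimiter=delimiter):
        row: Dict[str, object] = dict(raw)
        for field in ("Lower_Bound", "Upper_Bound", "Objective_Coefficient"):
            row[field] = float(raw[field])
        # Medium_Member is written as TRUE/FALSE text.
        row["Medium_Member"] = raw["Medium_Member"].strip().upper() == "TRUE"
        rows.append(row)
    return rows


# Two rows taken from the column-structure sample above.
sample = io.StringIO(
    "Reaction_ID\tGPR_Rule\tReaction_Formula\tLower_Bound\tUpper_Bound\t"
    "Objective_Coefficient\tMedium_Member\tCompartment\tSubsystem\n"
    "R00001\tGENE1 or GENE2\tA + B -> C + D\t-1000.0\t1000.0\t0.0\tFALSE\tcytosol\tGlycolysis\n"
    "EX_glc_e\t-\tglc_e <->\t-1000.0\t1000.0\t0.0\tTRUE\textracellular\tExchange\n"
)
rows = load_model_table(sample)
```

For a file on disk, pass an open file handle instead of the in-memory sample, for example `load_model_table(open("model_data.csv", newline=""))`.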

## Examples

### Extract Built-in Model Data

```bash
# Extract ENGRO2 model with default settings
metabolicModel2Tabular --model ENGRO2 \
                       --name ENGRO2_extraction \
                       --medium_selector allOpen \
                       --gene_format Default \
                       --out_tabular ENGRO2_data.csv \
                       --out_log ENGRO2_log.txt \
                       --tool_dir /opt/COBRAxy
```

### Process Custom Model

```bash
# Extract custom SBML model with gene conversion
metabolicModel2Tabular --input /data/custom_model.xml \
                       --name CustomModel \
                       --medium_selector allOpen \
                       --gene_format HGNC_SYMBOL \
                       --out_tabular custom_model_data.xlsx \
                       --out_log custom_extraction.log \
                       --tool_dir /opt/COBRAxy
```

### Extract Core Model for Quick Analysis

```bash
# Extract HMRcore for rapid prototyping
metabolicModel2Tabular --model HMRcore \
                       --name CoreModel \
                       --medium_selector allOpen \
                       --gene_format ENSNG \
                       --out_tabular core_reactions.csv \
                       --out_log core_log.txt \
                       --tool_dir /opt/COBRAxy
```

### Batch Processing Multiple Models

```bash
#!/bin/bash
models=("ENGRO2" "HMRcore" "Recon")
for model in "${models[@]}"; do
    metabolicModel2Tabular --model "$model" \
                           --name "${model}_extract" \
                           --medium_selector allOpen \
                           --gene_format HGNC_SYMBOL \
                           --out_tabular "${model}_data.csv" \
                           --out_log "${model}_log.txt" \
                           --tool_dir /opt/COBRAxy
done
```

## Use Cases

### Model Comparison
Extract multiple models to compare:
- Reaction coverage across different reconstructions
- Gene-reaction associations
- Pathway representation
- Metabolite compartmentalization
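
As a rough illustration of the first comparison, reaction coverage can be summarised by set overlap of the `Reaction_ID` columns of two extractions. The helper `reaction_overlap` below is hypothetical, and it assumes both models use the same reaction ID namespace; different reconstructions often do not, in which case the IDs would need to be harmonised first.

```python
from typing import Dict, Iterable


def reaction_overlap(ids_a: Iterable[str], ids_b: Iterable[str]) -> Dict[str, object]:
    """Summarise shared and model-specific reaction IDs plus the Jaccard index."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    return {
        "shared": a & b,
        "only_a": a - b,
        "only_b": b - a,
        # Jaccard index: |A ∩ B| / |A ∪ B|, defined here as 0.0 for two empty sets.
        "jaccard": len(a & b) / len(union) if union else 0.0,
    }
```

The two ID collections would typically come from the first column of each extracted table.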

### Data Integration
Prepare model data for:
- Custom analysis pipelines
- Database integration
- Pathway annotation
- Cross-reference mapping

### Quality Control
Validate model properties:
- Check reaction balancing
- Verify gene associations
- Assess network connectivity
- Identify missing annotations

### Custom Analysis
Export structured data for:
- Network analysis (graph theory)
- Machine learning applications
- Statistical modeling
- Comparative genomics

## Integration Workflow

### Downstream Tools

The extracted tabular data serves as input for:

#### COBRAxy Tools
- [RAS Generator](ras-generator.md) - Use extracted GPR rules
- [RPS Generator](rps-generator.md) - Use reaction formulas
- [RAS to Bounds](ras-to-bounds.md) - Use reaction bounds
- [MAREA](marea.md) - Use reaction annotations

#### External Analysis
- **R/Bioconductor**: Import CSV for pathway analysis
- **Python/pandas**: Load data for network analysis
- **MATLAB**: Process XLSX for modeling
- **Cytoscape**: Network visualization
- **Databases**: Populate reaction databases
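
For network analysis in Python, the `Reaction_Formula` column can be turned into directed metabolite-reaction edges. The sketch below uses only the standard library and rests on assumptions drawn from the sample rows: terms are separated by ` + `, the arrow is `->` or `<->`, and metabolite IDs contain no spaces. The function names are hypothetical, and stoichiometric coefficients are ignored.

```python
import re
from typing import List, Tuple

ARROW = re.compile(r"(<->|->)")


def _metabolites(side: str) -> List[str]:
    """Return metabolite IDs on one side of a formula, dropping coefficients."""
    ids = []
    for term in side.split(" + "):
        parts = term.split()
        if parts:
            ids.append(parts[-1])  # last token is the ID; a leading number is a coefficient
    return ids


def reaction_edges(reaction_id: str, formula: str) -> Tuple[List[Tuple[str, str]], bool]:
    """Return (edges, reversible) for one reaction of a bipartite graph."""
    pieces = ARROW.split(formula, maxsplit=1)
    if len(pieces) != 3:
        raise ValueError(f"No direction arrow in formula: {formula!r}")
    left, arrow, right = pieces
    edges = [(met, reaction_id) for met in _metabolites(left)]
    edges += [(reaction_id, met) for met in _metabolites(right)]
    return edges, arrow == "<->"
```

The resulting edge list can be exported for Cytoscape or loaded into a graph library of your choice.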

### Typical Pipeline

```bash
# 1. Extract model components (all required parameters included)
metabolicModel2Tabular --model ENGRO2 --name ModelData \
                       --medium_selector allOpen \
                       --out_tabular model_components.csv \
                       --out_log model_components.log \
                       --tool_dir /opt/COBRAxy

# 2. Use extracted data for RAS analysis
ras_generator -td /opt/COBRAxy -rs Custom \
              -rl model_components.csv \
              -in expression_data.tsv -ra ras_scores.tsv

# 3. Apply constraints and sample fluxes
ras_to_bounds -td /opt/COBRAxy -ms Custom -mo model_components.csv \
              -ir ras_scores.tsv -idop constrained_bounds/

# 4. Visualize results
marea -td /opt/COBRAxy -input_data ras_scores.tsv \
      -choice_map Custom -custom_map custom.svg -idop results/
```

## Quality Control

### Pre-extraction Validation
- Verify model file integrity and format
- Check SBML compliance for custom models
- Validate gene ID formats and coverage
- Confirm medium constraint specifications

### Post-extraction Checks
- **Completeness**: Verify all expected reactions extracted
- **Consistency**: Check stoichiometric balance
- **Annotations**: Validate gene-reaction associations
- **Formatting**: Confirm output file structure
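
Some of these checks can be automated on the extracted table. The following sketch covers structural checks only (duplicate IDs, formulas without a direction arrow, inverted bounds); a true stoichiometric balance check needs metabolite chemical formulas, which the table does not contain. `check_rows` is a hypothetical helper that expects rows as dictionaries keyed by the column names shown under Output Format.

```python
from collections import Counter
from typing import Dict, List


def check_rows(rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Return reaction IDs that fail basic structural checks."""
    counts = Counter(row["Reaction_ID"] for row in rows)
    issues: Dict[str, List[str]] = {
        "duplicate_ids": sorted(rid for rid, n in counts.items() if n > 1),
        "missing_arrow": [],
        "inverted_bounds": [],
    }
    for row in rows:
        # "<->" contains "->", so one substring test covers both arrow types.
        if "->" not in row["Reaction_Formula"]:
            issues["missing_arrow"].append(row["Reaction_ID"])
        if float(row["Lower_Bound"]) > float(row["Upper_Bound"]):
            issues["inverted_bounds"].append(row["Reaction_ID"])
    return issues
```

An empty list under every key means the table passed these particular checks; it does not certify the model as a whole.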

### Data Validation

#### Reaction Balancing
```bash
# List reactions whose formula lacks a direction arrow.
# This is a format check only; it does not test mass or charge balance.
awk -F'\t' 'NR>1 && $3 !~ /<->|->/ {print $1, $3}' model_data.csv
```

#### Gene Coverage
```bash
# Count reactions with GPR rules (empty cells and "-" placeholders are skipped)
awk -F'\t' 'NR>1 && $2 != "" && $2 != "-" {count++} END {printf "%d reactions with GPR\n", count}' model_data.csv
```

#### Exchange Reactions
```bash
# List medium components
awk -F'\t' 'NR>1 && $7 == "TRUE" {print $1}' model_data.csv
```

## Tips and Best Practices

### Model Selection
- **ENGRO2**: Balanced coverage for human tissue analysis
- **HMRcore**: Fast processing for algorithm development
- **Recon**: Comprehensive analysis requiring computational resources
- **Custom**: Organism-specific or specialized models

### Gene Format Selection
- **Default**: Preserve original model annotations
- **HGNC_SYMBOL**: Human-readable gene names
- **ENSNG**: Stable identifiers for bioinformatics
- **ENTREZ**: Cross-database compatibility

### Output Format Optimization
- **CSV**: Lightweight, universal compatibility
- **XLSX**: Rich formatting, multiple sheets possible
- Choose based on downstream analysis requirements

### Performance Considerations
- Large models (Recon) may require substantial memory
- Gene format conversion adds processing time
- Consider batch processing for multiple extractions

## Troubleshooting

### Common Issues

**Model loading fails**
- Check file format and compression
- Verify SBML validity for custom models
- Ensure sufficient system memory

**Gene format conversion errors**
- Mapping tables may not cover all genes
- Original gene IDs retained when conversion fails
- Check log file for conversion statistics

**Empty output file**
- Model may contain no reactions
- Check model file integrity
- Verify tool directory configuration

### Error Messages

| Error | Cause | Solution |
|-------|-------|----------|
| "Model file not found" | Invalid file path | Check file location and permissions |
| "Unsupported format" | Invalid model format | Use SBML, JSON, MAT, or YML |
| "Gene mapping failed" | Missing gene conversion data | Use Default format or update mappings |
| "Memory allocation error" | Insufficient system memory | Use smaller model or increase memory |

### Performance Issues

**Slow processing**
- Large models require more time
- Gene conversion adds overhead
- Monitor system resource usage

**Memory errors**
- Reduce model size if possible
- Process in smaller batches
- Increase available system memory

**Output file corruption**
- Check disk space availability
- Verify file write permissions
- Monitor for system interruptions

## Advanced Usage

### Custom Gene Mapping

Advanced users can extend gene format conversion by modifying mapping files in the `local/mappings/` directory.

### Batch Extraction Script

```python
#!/usr/bin/env python3
import subprocess

models = ['ENGRO2', 'HMRcore', 'Recon']
formats = ['Default', 'HGNC_SYMBOL', 'ENSNG']

# Run one extraction per (model, gene format) combination.
for model in models:
    for fmt in formats:
        cmd = [
            'metabolicModel2Tabular',
            '--model', model,
            '--name', f'{model}_{fmt}',
            '--medium_selector', 'allOpen',
            '--gene_format', fmt,
            '--out_tabular', f'{model}_{fmt}.csv',
            '--out_log', f'{model}_{fmt}.log',
            '--tool_dir', '/opt/COBRAxy'
        ]
        # check=True stops the batch at the first failing extraction.
        subprocess.run(cmd, check=True)
```

### Database Integration

Export model data to databases:

```sql
-- Load CSV into PostgreSQL
CREATE TABLE model_reactions (
    reaction_id VARCHAR(50),
    gpr_rule TEXT,
    reaction_formula TEXT,
    lower_bound FLOAT,
    upper_bound FLOAT,
    objective_coefficient FLOAT,
    medium_member BOOLEAN,
    compartment VARCHAR(50),
    subsystem VARCHAR(100)
);

-- COPY reads the file on the database server, so the path must be readable there;
-- from psql, \copy with the same arguments reads a client-side file instead.
COPY model_reactions FROM 'model_data.csv' WITH CSV HEADER;
```

## See Also

- [RAS Generator](ras-generator.md) - Use extracted GPR rules for RAS computation
- [RPS Generator](rps-generator.md) - Use reaction formulas for RPS analysis
- [Custom Model Tutorial](../tutorials/custom-model-integration.md)
- [Gene Mapping Reference](../tutorials/gene-id-conversion.md)