# Flux Simulation

Sample metabolic fluxes using constraint-based modeling with the CBS or OPTGP algorithm.

## Overview

Flux Simulation samples metabolic flux distributions from constrained models. It supports two sampling algorithms (CBS and OPTGP) and reports summary statistics of the sampled fluxes (mean, median, quantiles), with optional pFBA, FVA, and sensitivity analyses.

## Usage

### Command Line

```bash
flux_simulation -td /path/to/COBRAxy \
  -ms ENGRO2 \
  -in bounds1.tsv,bounds2.tsv \
  -ni Sample1,Sample2 \
  -a CBS \
  -ns 1000 \
  -nb 1 \
  -sd 42 \
  -ot mean,median,quantiles \
  -ota pFBA,FVA,sensitivity \
  -idop flux_results/
```

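When a run covers many bounds files, the comma-separated `-in` and `-ni` values are easier to build programmatically than by hand. The sketch below assembles the argument list for the command above. `build_flux_simulation_args` is a hypothetical helper, and deriving sample names from file stems is an illustrative convention, not something the tool requires.

```python
from pathlib import Path


def build_flux_simulation_args(tool_dir, bounds_files, out_dir,
                               algorithm="CBS", n_samples=1000,
                               n_batches=1, seed=42):
    """Assemble a flux_simulation argument list (hypothetical helper).

    Only flags documented on this page are used. Sample names are taken
    from the file stems, which is an assumption made for illustration.
    """
    files = sorted(Path(f) for f in bounds_files)
    if not files:
        raise ValueError("at least one bounds file is required")
    names = [f.stem for f in files]
    if len(set(names)) != len(names):
        raise ValueError("sample names derived from file stems must be unique")
    return [
        "flux_simulation",
        "-td", str(tool_dir),
        "-ms", "ENGRO2",
        "-in", ",".join(str(f) for f in files),
        "-ni", ",".join(names),
        "-a", algorithm,
        "-ns", str(n_samples),
        "-nb", str(n_batches),
        "-sd", str(seed),
        "-ot", "mean,median,quantiles",
        "-idop", str(out_dir),
    ]


args = build_flux_simulation_args(
    "/path/to/COBRAxy", ["bounds2.tsv", "bounds1.tsv"], "flux_results/")
print(" ".join(args))
```

Once the tool is installed, the list can be handed to `subprocess.run(args, check=True)`. Sorting the files first keeps the order of `-in` and `-ni` aligned.
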
### Galaxy Interface

Select "Flux Simulation" from the COBRAxy tool suite and configure sampling parameters through the web interface.

## Parameters

### Required Parameters

| Parameter | Flag | Description |
|-----------|------|-------------|
| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
| Input Bounds | `-in, --input` | Comma-separated list of bounds files |
| Sample Names | `-ni, --names` | Comma-separated sample names |
| Algorithm | `-a, --algorithm` | Sampling algorithm (CBS or OPTGP) |
| Number of Samples | `-ns, --n_samples` | Samples per batch |
| Number of Batches | `-nb, --n_batches` | Number of sampling batches |
| Random Seed | `-sd, --seed` | Random seed for reproducibility |
| Output Types | `-ot, --output_type` | Flux statistics to compute |

### Model Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Model Selector | `-ms, --model_selector` | Built-in model (ENGRO2, Custom) | ENGRO2 |
| Custom Model | `-mo, --model` | Path to custom SBML model | - |
| Model Name | `-mn, --model_name` | Custom model filename | - |

### Sampling Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Algorithm | `-a, --algorithm` | CBS or OPTGP | - |
| Thinning | `-th, --thinning` | Thinning parameter (OPTGP only) | 100 |
| Samples | `-ns, --n_samples` | Samples per batch | - |
| Batches | `-nb, --n_batches` | Number of batches | - |
| Seed | `-sd, --seed` | Random seed | - |

### Output Parameters

| Parameter | Flag | Description | Options / Default |
|-----------|------|-------------|-------------------|
| Output Types | `-ot, --output_type` | Flux statistics | mean,median,quantiles,fluxes |
| Analysis Types | `-ota, --output_type_analysis` | Additional analyses | pFBA,FVA,sensitivity |
| Output Path | `-idop, --output_path` | Results directory | Default: `flux_simulation/` |
| Output Log | `-ol, --out_log` | Log file path | - |

## Algorithms

### CBS (Constraint-Based Sampling)

**Method**: Random objective function optimization
- Generates random linear combinations of reaction fluxes as objective functions
- Optimizes each objective with an LP solver (GLPK preferred, COBRApy as fallback)
- Fast and memory-efficient
- Suitable for large models

**Advantages**:
- High performance with GLPK
- Good coverage of solution space
- Robust to model size

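The idea behind random-objective sampling can be illustrated on a toy problem. The sketch below is not COBRAxy's implementation: it ignores the steady-state constraint and considers only the flux bounds, in which case maximizing a linear objective has a closed-form answer (each flux sits at its upper bound when its weight is positive, and at its lower bound otherwise). A real run solves a full LP per objective. The function name and the weight distribution are assumptions.

```python
import random


def sample_box_corners(bounds, n_samples, seed):
    """Maximize random linear objectives over a box of flux bounds.

    bounds: dict mapping reaction id -> (lower, upper).
    Toy stand-in for the LP step: with bounds as the only constraints,
    the optimum of sum(w_i * v_i) sets v_i = upper if w_i > 0 else lower.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Random objective weights in [-1, 1] (illustrative choice).
        weights = {rxn: rng.uniform(-1.0, 1.0) for rxn in bounds}
        corner = {
            rxn: (hi if weights[rxn] > 0 else lo)
            for rxn, (lo, hi) in bounds.items()
        }
        samples.append(corner)
    return samples


bounds = {"R00001": (-1000.0, 1250.5), "R00002": (-650.2, 1000.0)}
corners = sample_box_corners(bounds, n_samples=5, seed=42)
for corner in corners:
    print(corner)
```

Every sample produced this way lies on the boundary of the feasible region, which is why each one costs a single optimization rather than a long random walk.
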
### OPTGP (Optimized General Parallel Sampler)

**Method**: MCMC-based sampling
- Markov chain Monte Carlo (hit-and-run) sampling of the feasible flux space
- Requires thinning to reduce autocorrelation between consecutive samples
- More computationally intensive
- Stronger theoretical guarantees than CBS

**Advantages**:
- Converges toward uniform sampling of the feasible space
- Well-established method
- Good for smaller models

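To make the role of thinning concrete, the sketch below runs a plain hit-and-run walk inside a box of flux bounds and keeps only every `thinning`-th point. It is a simplified stand-in, not OPTGP itself: it omits the steady-state constraint, warmup points, centering, and parallel chains. The function name and signature are assumptions.

```python
import random


def hit_and_run_box(lower, upper, n_samples, thinning, seed):
    """Plain hit-and-run inside a box, keeping every `thinning`-th point.

    lower, upper: per-reaction bounds (lists of equal length, lower < upper).
    """
    rng = random.Random(seed)
    # Start from the centre of the box.
    point = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    kept = []
    step = 0
    while len(kept) < n_samples:
        direction = [rng.gauss(0.0, 1.0) for _ in point]
        # Range of t for which point + t * direction stays within bounds.
        t_min, t_max = float("-inf"), float("inf")
        for x, d, lo, hi in zip(point, direction, lower, upper):
            if d > 0:
                t_min = max(t_min, (lo - x) / d)
                t_max = min(t_max, (hi - x) / d)
            elif d < 0:
                t_min = max(t_min, (hi - x) / d)
                t_max = min(t_max, (lo - x) / d)
        t = rng.uniform(t_min, t_max)
        # Clamp to guard against floating-point rounding at the boundary.
        point = [min(max(x + t * d, lo), hi)
                 for x, d, lo, hi in zip(point, direction, lower, upper)]
        step += 1
        if step % thinning == 0:
            kept.append(point)
    return kept


lower = [-1000.0, -650.2, 0.0]
upper = [1250.5, 1000.0, 2150.8]
samples = hit_and_run_box(lower, upper, n_samples=10, thinning=100, seed=123)
print(samples[0])
```

Consecutive points of the walk are correlated because each one starts from the previous one. Discarding the points in between trades extra computation for samples that are closer to independent.
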
## Input Formats

### Bounds Files

Tab-separated file with one row per reaction, giving its lower and upper flux bounds:

```
Reaction    lower_bound    upper_bound
R00001      -1000.0        1250.5
R00002      -650.2         1000.0
R00003      0.0            2150.8
```

To process several bounds files in one run, pass their paths to `-in` as a comma-separated list.

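A bounds file can be checked before sampling, since a lower bound above its upper bound makes the model infeasible. The following is a minimal sketch that assumes the column names shown in the example above; `load_bounds` is a hypothetical helper, not part of COBRAxy.

```python
import csv
import io


def load_bounds(handle):
    """Read a tab-separated bounds table and check each row.

    Returns a dict mapping reaction id -> (lower_bound, upper_bound).
    Column names follow the example on this page (an assumption about
    real files).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    bounds = {}
    for row in reader:
        rxn = row["Reaction"]
        lower = float(row["lower_bound"])
        upper = float(row["upper_bound"])
        if lower > upper:
            raise ValueError(
                f"{rxn}: lower bound {lower} exceeds upper bound {upper}")
        bounds[rxn] = (lower, upper)
    return bounds


example = io.StringIO(
    "Reaction\tlower_bound\tupper_bound\n"
    "R00001\t-1000.0\t1250.5\n"
    "R00002\t-650.2\t1000.0\n"
    "R00003\t0.0\t2150.8\n"
)
bounds = load_bounds(example)
print(bounds["R00003"])
```

For a file on disk, pass `open(path, newline="")` instead of the in-memory example.
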
### Custom Model File (Optional)

A metabolic model in SBML format that COBRApy can load.

## Output Formats

### Flux Statistics

#### Mean Fluxes (`mean.csv`)
```
Reaction    Sample1    Sample2    Sample3
R00001      15.23      -8.45      22.1
R00002      0.0        12.67      -5.3
R00003      45.8       38.2       51.7
```

#### Median Fluxes (`median.csv`)
```
Reaction    Sample1    Sample2    Sample3
R00001      14.1       -7.8       21.5
R00002      0.0        11.9       -4.8
R00003      44.2       37.1       50.3
```

#### Quantiles (`quantiles.csv`)
```
Reaction    Sample1_q1    Sample1_q2    Sample1_q3    Sample2_q1    ...
R00001      10.5          14.1          18.7          -12.3         ...
R00002      -2.1          0.0           1.8           8.9           ...
R00003      38.9          44.2          49.8          32.1          ...
```

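In the example tables, the `q2` values coincide with the medians, which suggests that `q1` to `q3` are quartiles. Under that assumption, the per-reaction statistics can be reproduced from raw sampled fluxes with Python's `statistics` module. The interpolation method the tool uses is not documented here, so small numerical differences are possible; `summarize_fluxes` is a hypothetical name.

```python
import statistics


def summarize_fluxes(values):
    """Return mean, median and quartiles for one reaction's sampled fluxes."""
    # "inclusive" treats the data as covering the full range of values;
    # this choice is an assumption, not the tool's documented behaviour.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q2": q2,
        "q3": q3,
    }


summary = summarize_fluxes([1.0, 2.0, 3.0, 4.0, 5.0])
print(summary)
```

Comparing the mean with the median is a quick way to spot skewed flux distributions, where a single summary value can mislead.
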
### Additional Analyses

#### pFBA (`pFBA.csv`)
Parsimonious Flux Balance Analysis results:
```
Reaction    Sample1    Sample2    Sample3
R00001      12.5       -6.7       19.3
R00002      0.0        8.9        -3.2
R00003      41.2       35.8       47.9
```

#### FVA (`FVA.csv`)
Flux Variability Analysis bounds:
```
Reaction    Sample1_min    Sample1_max    Sample2_min    Sample2_max    ...
R00001      -5.2           35.8           -25.3          8.7            ...
R00002      -8.9           8.9            0.0            28.4           ...
R00003      15.6           78.3           10.2           65.9           ...
```

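The width of each FVA interval indicates how much freedom a reaction has under the given constraints. The sketch below computes that width for one sample, assuming columns named `<sample>_min` and `<sample>_max` as in the example above. The tab delimiter and the function name are assumptions.

```python
import csv
import io


def fva_ranges(handle, sample, delimiter="\t"):
    """Return {reaction: max - min} for one sample of an FVA table.

    Assumes columns named '<sample>_min' and '<sample>_max'; the
    delimiter of the real output file is an assumption.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    ranges = {}
    for row in reader:
        low = float(row[f"{sample}_min"])
        high = float(row[f"{sample}_max"])
        ranges[row["Reaction"]] = high - low
    return ranges


table = io.StringIO(
    "Reaction\tSample1_min\tSample1_max\n"
    "R00001\t-5.2\t35.8\n"
    "R00002\t-8.9\t8.9\n"
    "R00003\t15.6\t78.3\n"
)
ranges = fva_ranges(table, "Sample1")
# Reactions ordered from most to least flexible.
ranked = sorted(ranges, key=ranges.get, reverse=True)
print(ranked)
```

Reactions with a width near zero are effectively fixed by the constraints, so their sampled distributions carry little additional information.
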
#### Sensitivity (`sensitivity.csv`)
Single reaction deletion effects:
```
Reaction    Sample1    Sample2    Sample3
R00001      0.98       0.95       0.97
R00002      1.0        0.87       1.0
R00003      0.23       0.19       0.31
```

## Examples

### Basic CBS Sampling

```bash
# Simple CBS sampling with statistics
flux_simulation -td /opt/COBRAxy \
  -ms ENGRO2 \
  -in sample1_bounds.tsv,sample2_bounds.tsv \
  -ni Sample1,Sample2 \
  -a CBS \
  -ns 500 \
  -nb 2 \
  -sd 42 \
  -ot mean,median \
  -ota pFBA \
  -idop cbs_results/
```

### Comprehensive OPTGP Analysis

```bash
# Full analysis with OPTGP
# -in expects a comma-separated list, so join the glob matches with commas;
# -ni must list the sample names in the same (sorted) order as the files
flux_simulation -td /opt/COBRAxy \
  -ms ENGRO2 \
  -in "$(printf '%s\n' bounds/*.tsv | paste -sd, -)" \
  -ni Sample1,Sample2,Sample3,Control1,Control2 \
  -a OPTGP \
  -th 200 \
  -ns 1000 \
  -nb 1 \
  -sd 123 \
  -ot mean,median,quantiles,fluxes \
  -ota pFBA,FVA,sensitivity \
  -idop comprehensive_analysis/ \
  -ol sampling.log
```

### Custom Model Sampling

```bash
# Use custom model with CBS
flux_simulation -td /opt/COBRAxy \
  -ms Custom \
  -mo models/tissue_specific.xml \
  -mn tissue_specific.xml \
  -in patient_bounds.tsv \
  -ni PatientA \
  -a CBS \
  -ns 2000 \
  -nb 5 \
  -sd 456 \
  -ot mean,quantiles \
  -ota FVA,sensitivity \
  -idop patient_analysis/
```

### Batch Processing Multiple Conditions

```bash
# Process multiple experimental conditions
flux_simulation -td /opt/COBRAxy \
  -ms ENGRO2 \
  -in ctrl1.tsv,ctrl2.tsv,treat1.tsv,treat2.tsv \
  -ni Control1,Control2,Treatment1,Treatment2 \
  -a CBS \
  -ns 800 \
  -nb 3 \
  -sd 789 \
  -ot mean,median,fluxes \
  -ota pFBA,FVA \
  -idop batch_conditions/
```

## Algorithm Selection Guide

### Choose CBS When:
- Large models (>1000 reactions)
- High sample throughput required
- GLPK solver available
- Memory constraints present

### Choose OPTGP When:
- Theoretical sampling guarantees needed
- Smaller models (<500 reactions)
- Sufficient computational resources
- Publication-quality sampling required

## Performance Optimization

### CBS Optimization
- Install GLPK and swiglpk for maximum performance
- Increase the number of batches rather than the samples per batch
- Monitor memory usage for large models

### OPTGP Optimization
- Adjust thinning based on model size (100-500)
- Use parallel processing when available
- Consider a warmup period for chain convergence

### General Tips
- Use 500-2000 samples per condition as a starting point
- Trade batches against samples per batch to keep memory use manageable
- Set consistent random seeds for reproducibility

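Because `-ns` is the number of samples per batch and `-nb` the number of batches, the total per condition is presumably their product. The arithmetic for splitting a target total into batches under a per-batch cap can be sketched as follows; `plan_batches` is a hypothetical helper, and the cap is whatever fits in memory on your system.

```python
import math


def plan_batches(total_samples, max_per_batch):
    """Split a target sample count into equally sized batches.

    Returns (n_samples, n_batches) such that n_samples <= max_per_batch
    and n_samples * n_batches >= total_samples.
    """
    if total_samples <= 0 or max_per_batch <= 0:
        raise ValueError("both arguments must be positive")
    n_batches = math.ceil(total_samples / max_per_batch)
    n_samples = math.ceil(total_samples / n_batches)
    return n_samples, n_batches


n_samples, n_batches = plan_batches(total_samples=2000, max_per_batch=500)
print(f"-ns {n_samples} -nb {n_batches}")
```
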
## Quality Control

### Convergence Assessment
- Compare statistics across batches
- Check for systematic trends in sampling
- Validate against known flux ranges

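One simple way to compare statistics across batches is to look at how far the per-batch means of a reaction drift apart relative to the overall spread of its samples. The metric below is an illustrative choice, not a COBRAxy feature, and any threshold applied to it would be a judgment call.

```python
import statistics


def batch_mean_spread(batches):
    """Compare per-batch mean fluxes for one reaction.

    batches: list of lists, one list of sampled fluxes per batch.
    Returns (batch_means, spread), where spread is the range of the batch
    means divided by the pooled standard deviation (0.0 if the pooled
    samples do not vary).
    """
    means = [statistics.fmean(b) for b in batches]
    pooled = [v for b in batches for v in b]
    sd = statistics.pstdev(pooled)
    spread = (max(means) - min(means)) / sd if sd > 0 else 0.0
    return means, spread


agreeing = [[1.0, 2.0, 3.0], [1.5, 2.0, 2.5], [1.0, 2.0, 3.0]]
drifting = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(batch_mean_spread(agreeing)[1])
print(batch_mean_spread(drifting)[1])
```

A spread close to zero means the batches agree. A large value, as in the second case, points to a systematic trend between batches that deserves a closer look.
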
### Statistical Validation
- Ensure adequate sample sizes (n≥100 recommended)
- Check for outliers and artifacts
- Validate against experimental flux data when available

### Output Verification
- Confirm mass balance constraints satisfied
- Check thermodynamic consistency
- Verify biological plausibility of results

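Mass balance and bound checks can be automated for any sampled flux vector. The sketch below tests the steady-state condition (net production of every metabolite is zero) and the flux bounds on a three-reaction toy network. The data structures and the function name are assumptions chosen for brevity; a real model would supply its own stoichiometry.

```python
def check_flux_vector(stoichiometry, fluxes, bounds, tol=1e-6):
    """Check steady state (S v = 0) and flux bounds for one sample.

    stoichiometry: dict metabolite -> {reaction: coefficient}.
    fluxes: dict reaction -> flux value.
    bounds: dict reaction -> (lower, upper).
    Returns a list of human-readable violations (empty if none).
    """
    problems = []
    for met, coeffs in stoichiometry.items():
        net = sum(coef * fluxes[rxn] for rxn, coef in coeffs.items())
        if abs(net) > tol:
            problems.append(f"{met}: net production {net:g}")
    for rxn, (lo, hi) in bounds.items():
        if not (lo - tol <= fluxes[rxn] <= hi + tol):
            problems.append(
                f"{rxn}: flux {fluxes[rxn]:g} outside [{lo:g}, {hi:g}]")
    return problems


# Toy chain: uptake -> A, A -> B, B -> secretion.
S = {"A": {"uptake": 1, "convert": -1},
     "B": {"convert": 1, "secrete": -1}}
limits = {"uptake": (0, 10), "convert": (0, 10), "secrete": (0, 10)}
balanced = {"uptake": 4.0, "convert": 4.0, "secrete": 4.0}
leaky = {"uptake": 4.0, "convert": 3.0, "secrete": 12.0}
print(check_flux_vector(S, balanced, limits))
print(check_flux_vector(S, leaky, limits))
```

The tolerance absorbs solver round-off; a value that is too strict will flag numerically valid samples.
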
## Integration Workflow

### Upstream Tools
- [RAS to Bounds](ras-to-bounds.md) - Generate constrained bounds from RAS
- [Model Setting](metabolic-model-setting.md) - Extract model components

### Downstream Tools
- [Flux to Map](flux-to-map.md) - Visualize flux distributions on metabolic maps
- [MAREA](marea.md) - Statistical analysis of flux differences

### Typical Pipeline

```bash
# 1. Generate sample-specific bounds
ras_to_bounds -td /opt/COBRAxy -ms ENGRO2 -ir ras.tsv -idop bounds/

# 2. Sample fluxes from constrained models
#    (-in takes a comma-separated list, so join the glob matches)
flux_simulation -td /opt/COBRAxy -ms ENGRO2 \
  -in "$(printf '%s\n' bounds/*.tsv | paste -sd, -)" \
  -ni Sample1,Sample2,Sample3 -a CBS -ns 1000 -nb 1 -sd 42 \
  -ot mean,quantiles -ota pFBA,FVA -idop fluxes/

# 3. Visualize results on metabolic maps
flux_to_map -td /opt/COBRAxy -input_data_fluxes fluxes/mean.csv \
  -choice_map ENGRO2 -idop flux_maps/
```

## Troubleshooting

### Common Issues

**CBS sampling fails**
- GLPK installation issues → Install GLPK and swiglpk
- Model infeasibility → Check bounds constraints
- Memory errors → Reduce samples per batch

**OPTGP convergence problems**
- Poor mixing → Increase thinning parameter
- Slow convergence → Extend sampling (more samples or batches)
- Chain stuck → Check model feasibility

**Output files missing**
- Insufficient disk space → Check available storage
- Permission errors → Verify write permissions
- Invalid sample names → Check naming conventions

### Error Messages

| Error | Cause | Solution |
|-------|-------|----------|
| "GLPK solver failed" | Missing GLPK/swiglpk | Install GLPK libraries |
| "Model infeasible" | Over-constrained bounds | Relax constraints or check model |
| "Sampling timeout" | Insufficient time/resources | Reduce sample size or increase resources |

### Performance Issues

**Slow sampling**
- Use CBS instead of OPTGP for speed
- Reduce model size if possible
- Increase system resources

**Memory errors**
- Lower samples per batch
- Process samples sequentially
- Use more efficient data formats

**Disk space issues**
- Monitor output file sizes
- Clean intermediate files
- Use compressed formats when possible

## Advanced Usage

### Custom Sampling Parameters

For fine-tuning sampling behavior, advanced users can modify:
- Objective function generation (CBS)
- MCMC parameters (OPTGP)
- Convergence criteria
- Output precision and format

### Parallel Processing

```bash
# Split sampling across multiple cores/nodes:
# one background job per bounds subset, each with its own seed
for i in {1..4}; do
  flux_simulation -td /opt/COBRAxy -ms ENGRO2 \
    -in "subset_${i}_bounds.tsv" \
    -ni "Batch${i}" -a CBS -ns 250 -nb 1 \
    -sd $((42 + i)) -ot mean -idop "batch_${i}/" &
done
wait
```

### Result Aggregation

Combine results from multiple simulation runs:

```bash
# Merge statistics files
python merge_flux_results.py -i batch_*/mean.csv -o combined_mean.csv
```

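The merge script referenced above is not shown on this page. As a rough sketch of what such a step could look like, the function below joins several per-run statistics tables on the `Reaction` column, keeping one column per sample. The function name, the tab delimiter, and the requirement that sample names be unique across runs are all assumptions.

```python
import csv
import io


def merge_tables(handles, delimiter="\t"):
    """Join flux tables column-wise on their 'Reaction' column.

    Returns (header, rows). Reactions missing from a table get an empty
    cell for that table's samples.
    """
    header = ["Reaction"]
    merged = {}  # reaction -> {sample: value}
    for handle in handles:
        reader = csv.DictReader(handle, delimiter=delimiter)
        samples = [c for c in reader.fieldnames if c != "Reaction"]
        clash = set(samples) & set(header)
        if clash:
            raise ValueError(f"duplicate sample columns: {sorted(clash)}")
        header.extend(samples)
        for row in reader:
            merged.setdefault(row["Reaction"], {}).update(
                {s: row[s] for s in samples})
    rows = [[rxn] + [values.get(s, "") for s in header[1:]]
            for rxn, values in merged.items()]
    return header, rows


run1 = io.StringIO("Reaction\tBatch1\nR00001\t15.23\nR00002\t0.0\n")
run2 = io.StringIO("Reaction\tBatch2\nR00001\t-8.45\nR00002\t12.67\n")
header, rows = merge_tables([run1, run2])
print(header)
print(rows)
```

With real output, open each `batch_*/mean.csv`, pass the file handles to the function, and write the result with `csv.writer`. Values are kept as strings so that the original precision is preserved.
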
## See Also

- [RAS to Bounds](ras-to-bounds.md) - Generate input constraints
- [Flux to Map](flux-to-map.md) - Visualize flux results
- [CBS Algorithm Documentation](../tutorials/cbs-algorithm.md)
- [OPTGP Algorithm Documentation](../tutorials/optgp-algorithm.md)