comparison COBRAxy/docs/tools/flux-to-map.md @ 492:4ed95023af20 draft
Uploaded
| author | francesco_lapi |
|---|---|
| date | Tue, 30 Sep 2025 14:02:17 +0000 |
| parents | |
| children | fcdbc81feb45 |
| 491:7a413a5ec566 | 492:4ed95023af20 |
|---|---|
| 1 # Flux to Map | |
| 2 | |
| 3 Visualize metabolic flux data on pathway maps with statistical analysis and color coding. | |
| 4 | |
| 5 ## Overview | |
| 6 | |
| 7 Flux to Map runs statistical tests on flux distribution data and generates color-coded metabolic pathway maps. It compares flux values between sample groups and highlights significantly different reactions, using color for the direction of change and line width for the fold change magnitude. | |
| 8 | |
| 9 ## Usage | |
| 10 | |
| 11 ### Command Line | |
| 12 | |
| 13 ```bash | |
| 14 flux_to_map -td /path/to/COBRAxy \ | |
| 15 --input_data_fluxes flux_data.tsv \ | |
| 16 --input_class_fluxes sample_groups.tsv \ | |
| 17 --comparison manyvsmany \ | |
| 18 --test ks \ | |
| 19 -pv 0.05 \ | |
| 20 -fc 1.5 \ | |
| 21 --choice_map ENGRO2 \ | |
| 22 --generate_svg true \ | |
| 23 --generate_pdf true \ | |
| 24 -idop flux_maps/ | |
| 25 ``` | |
| 26 | |
| 27 ### Galaxy Interface | |
| 28 | |
| 29 Select "Flux to Map" from the COBRAxy tool suite and configure flux analysis and visualization parameters. | |
| 30 | |
| 31 ## Parameters | |
| 32 | |
| 33 ### Required Parameters | |
| 34 | |
| 35 | Parameter | Flag | Description | | |
| 36 |-----------|------|-------------| | |
| 37 | Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory | | |
| 38 | |
| 39 ### Data Input Parameters | |
| 40 | |
| 41 | Parameter | Flag | Description | Default | | |
| 42 |-----------|------|-------------|---------| | |
| 43 | Flux Data | `-idf, --input_data_fluxes` | Flux values TSV file | - | | |
| 44 | Flux Classes | `-icf, --input_class_fluxes` | Sample group labels for fluxes | - | | |
| 45 | Multiple Flux Files | `-idsf, --input_datas_fluxes` | Multiple flux datasets (space-separated) | - | | |
| 46 | Flux Names | `-naf, --names_fluxes` | Names for multiple flux datasets | - | | |
| 47 | Analysis Option | `-op, --option` | Analysis mode (datasets or dataset_class) | - | | |
| 48 | |
| 49 ### Statistical Parameters | |
| 50 | |
| 51 | Parameter | Flag | Description | Default | | |
| 52 |-----------|------|-------------|---------| | |
| 53 | Comparison Type | `-co, --comparison` | Statistical comparison mode | manyvsmany | | |
| 54 | Statistical Test | `-te, --test` | Statistical test method | ks | | |
| 55 | P-Value Threshold | `-pv, --pValue` | Significance threshold | 0.1 | | |
| 56 | Adjusted P-values | `-adj, --adjusted` | Apply FDR correction | false | | |
| 57 | Fold Change | `-fc, --fChange` | Minimum fold change threshold | 1.5 | | |
| 58 | |
| 59 ### Visualization Parameters | |
| 60 | |
| 61 | Parameter | Flag | Description | Default | | |
| 62 |-----------|------|-------------|---------| | |
| 63 | Map Choice | `-mc, --choice_map` | Built-in metabolic map | HMRcore | | |
| 64 | Custom Map | `-cm, --custom_map` | Path to custom SVG map | - | | |
| 65 | Generate SVG | `-gs, --generate_svg` | Create SVG output | true | | |
| 66 | Generate PDF | `-gp, --generate_pdf` | Create PDF output | true | | |
| 67 | Color Map | `-colorm, --color_map` | Color scheme (jet, viridis) | - | | |
| 68 | Output Directory | `-idop, --output_path` | Results directory | result/ | | |
| 69 | |
| 70 ### Advanced Parameters | |
| 71 | |
| 72 | Parameter | Flag | Description | Default | | |
| 73 |-----------|------|-------------|---------| | |
| 74 | Output Log | `-ol, --out_log` | Log file path | - | | |
| 75 | Control Sample | `-on, --control` | Control group identifier | - | | |
| 76 | |
| 77 ## Input Formats | |
| 78 | |
| 79 ### Flux Data File | |
| 80 | |
| 81 Tab-separated format with reactions as rows and samples as columns: | |
| 82 | |
| 83 ``` | |
| 84 Reaction Sample1 Sample2 Sample3 Control1 Control2 | |
| 85 R00001 15.23 -8.45 22.1 12.8 14.2 | |
| 86 R00002 0.0 12.67 -5.3 8.9 7.4 | |
| 87 R00003 45.8 38.2 51.7 42.1 39.8 | |
| 88 R00004 -12.4 -15.8 -9.2 -11.5 -13.1 | |
| 89 ``` | |
| 90 | |
| 91 ### Sample Class File | |
| 92 | |
| 93 Group assignment for statistical comparisons: | |
| 94 | |
| 95 ``` | |
| 96 Sample Class | |
| 97 Sample1 Treatment | |
| 98 Sample2 Treatment | |
| 99 Sample3 Treatment | |
| 100 Control1 Control | |
| 101 Control2 Control | |
| 102 ``` | |
| 103 | |
| 104 ### Multiple Dataset Format | |
| 105 | |
| 106 When using multiple flux files, provide space-separated paths and corresponding names: | |
| 107 | |
| 108 ```bash | |
| 109 -idsf "dataset1_flux.tsv dataset2_flux.tsv dataset3_flux.tsv" | |
| 110 -naf "Condition_A Condition_B Condition_C" | |
| 111 ``` | |
| 112 | |
| 113 ## Statistical Analysis | |
| 114 | |
| 115 ### Comparison Types | |
| 116 | |
| 117 #### manyvsmany | |
| 118 Compare all possible group pairs: | |
| 119 - Treatment vs Control | |
| 120 - Condition_A vs Condition_B | |
| 121 - Condition_A vs Condition_C | |
| 122 - Condition_B vs Condition_C | |
| 123 | |
| 124 #### onevsrest | |
| 125 Compare each group against all others combined: | |
| 126 - Treatment vs (Control + Other) | |
| 127 - Control vs (Treatment + Other) | |
| 128 | |
| 129 #### onevsmany | |
| 130 Compare one reference group against each other group: | |
| 131 - Control vs Treatment | |
| 132 - Control vs Condition_A | |
| 133 - Control vs Condition_B | |
| 134 | |
| 135 ### Statistical Tests | |
| 136 | |
| 137 | Test | Description | Best For | | |
| 138 |------|-------------|----------| | |
| 139 | `ks` | Kolmogorov-Smirnov | Non-parametric, distribution-free | | |
| 140 | `ttest_p` | Paired t-test | Related samples, normal distributions | | |
| 141 | `ttest_ind` | Independent t-test | Independent samples, normal distributions | | |
| 142 | `wilcoxon` | Wilcoxon signed-rank | Non-parametric paired comparisons | | |
| 143 | `mw` | Mann-Whitney U | Non-parametric independent comparisons | | |
| 144 | |
| 145 ### Significance Assessment | |
| 146 | |
| 147 Reactions are considered significant when: | |
| 148 1. **P-value** ≤ specified threshold (default: 0.1) | |
| 149 2. **Fold change** ≥ specified threshold (default: 1.5) | |
| 150 3. **FDR correction** (if enabled): the adjusted p-value is also at or below the threshold | |
| 151 | |
| 152 ## Map Visualization | |
| 153 | |
| 154 ### Built-in Maps | |
| 155 | |
| 156 #### HMRcore (Default) | |
| 157 - **Scope**: Core human metabolic network | |
| 158 - **Reactions**: ~300 essential reactions | |
| 159 - **Coverage**: Central carbon, amino acid, lipid metabolism | |
| 160 - **Use Case**: General overview, publication figures | |
| 161 | |
| 162 #### ENGRO2 | |
| 163 - **Scope**: Extended human genome-scale reconstruction | |
| 164 - **Reactions**: ~2,000 reactions | |
| 165 - **Coverage**: Comprehensive metabolic network | |
| 166 - **Use Case**: Detailed analysis, specialized tissues | |
| 167 | |
| 168 #### Custom Maps | |
| 169 User-provided SVG files with reaction elements: | |
| 170 ```xml | |
| 171 <rect id="R00001" class="reaction" fill="gray" stroke="black"/> | |
| 172 <path id="R00002" class="reaction" fill="gray" stroke="black"/> | |
| 173 ``` | |
| 174 | |
| 175 ### Color Coding Scheme | |
| 176 | |
| 177 #### Significance Colors | |
| 178 - **Red Gradient**: Significantly upregulated (positive fold change) | |
| 179 - **Blue Gradient**: Significantly downregulated (negative fold change) | |
| 180 - **Gray**: Not statistically significant | |
| 181 - **White**: No data available | |
| 182 | |
| 183 #### Visual Elements | |
| 184 - **Line Width**: Proportional to fold change magnitude | |
| 185 - **Color Intensity**: Proportional to statistical significance (-log10 p-value) | |
| 186 - **Transparency**: Indicates confidence level | |
| 187 | |
| 188 ### Color Maps | |
| 189 | |
| 190 #### Jet (Default) | |
| 191 - High contrast color transitions | |
| 192 - Blue (low) → Green → Yellow → Red (high) | |
| 193 - Good for identifying extreme values | |
| 194 | |
| 195 #### Viridis | |
| 196 - Perceptually uniform color scale | |
| 197 - Colorblind-friendly | |
| 198 - Purple (low) → Blue → Green → Yellow (high) | |
| 199 | |
| 200 ## Output Files | |
| 201 | |
| 202 ### Statistical Results | |
| 203 - `flux_statistics.tsv`: P-values, fold changes, test statistics for all reactions | |
| 204 - `significant_fluxes.tsv`: Only reactions meeting significance criteria | |
| 205 - `comparison_summary.txt`: Analysis parameters and summary statistics | |
| 206 | |
| 207 ### Visualizations | |
| 208 - `flux_map.svg`: Interactive color-coded pathway map | |
| 209 - `flux_map.pdf`: High-resolution PDF (if requested) | |
| 210 - `flux_map.png`: Raster image (if requested) | |
| 211 - `legend.svg`: Color scale and statistical significance legend | |
| 212 | |
| 213 ### Analysis Files | |
| 214 - `fold_changes.tsv`: Detailed fold change calculations | |
| 215 - `group_statistics.tsv`: Per-group summary statistics | |
| 216 - `comparison_matrix.tsv`: Pairwise comparison results | |
| 217 | |
| 218 ## Examples | |
| 219 | |
| 220 ### Basic Flux Comparison | |
| 221 | |
| 222 ```bash | |
| 223 # Compare treatment vs control fluxes | |
| 224 flux_to_map -td /opt/COBRAxy \ | |
| 225 -idf treatment_vs_control_fluxes.tsv \ | |
| 226 -icf sample_groups.tsv \ | |
| 227 -co manyvsmany \ | |
| 228 -te ks \ | |
| 229 -pv 0.05 \ | |
| 230 -fc 2.0 \ | |
| 231 -mc HMRcore \ | |
| 232 -gs true \ | |
| 233 -gp true \ | |
| 234 -idop flux_comparison/ | |
| 235 ``` | |
| 236 | |
| 237 ### Multiple Condition Analysis | |
| 238 | |
| 239 ```bash | |
| 240 # Compare multiple experimental conditions | |
| 241 flux_to_map -td /opt/COBRAxy \ | |
| 242 -idsf "cond1_flux.tsv cond2_flux.tsv cond3_flux.tsv" \ | |
| 243 -naf "Control Treatment1 Treatment2" \ | |
| 244 -co onevsrest \ | |
| 245 -te wilcoxon \ | |
| 246 -adj true \ | |
| 247 -pv 0.01 \ | |
| 248 -fc 1.8 \ | |
| 249 -mc ENGRO2 \ | |
| 250 -colorm viridis \ | |
| 251 -idop multi_condition_flux/ | |
| 252 ``` | |
| 253 | |
| 254 ### Custom Map Visualization | |
| 255 | |
| 256 ```bash | |
| 257 # Use tissue-specific custom map | |
| 258 flux_to_map -td /opt/COBRAxy \ | |
| 259 -idf liver_flux_data.tsv \ | |
| 260 -icf liver_conditions.tsv \ | |
| 261 -co manyvsmany \ | |
| 262 -te ttest_ind \ | |
| 263 -pv 0.05 \ | |
| 264 -fc 1.5 \ | |
| 265 -cm maps/liver_specific_map.svg \ | |
| 266 -gs true \ | |
| 267 -gp true \ | |
| 268 -idop liver_flux_analysis/ \ | |
| 269 -ol liver_analysis.log | |
| 270 ``` | |
| 271 | |
| 272 ### High-Throughput Analysis | |
| 273 | |
| 274 ```bash | |
| 275 # Process multiple datasets with stringent criteria | |
| 276 flux_to_map -td /opt/COBRAxy \ | |
| 277 -idsf "exp1.tsv exp2.tsv exp3.tsv exp4.tsv" \ | |
| 278 -naf "Exp1 Exp2 Exp3 Exp4" \ | |
| 279 -co manyvsmany \ | |
| 280 -te ks \ | |
| 281 -adj true \ | |
| 282 -pv 0.001 \ | |
| 283 -fc 3.0 \ | |
| 284 -mc HMRcore \ | |
| 285 -colorm jet \ | |
| 286 -gs true \ | |
| 287 -gp true \ | |
| 288 -idop high_throughput_flux/ | |
| 289 ``` | |
| 290 | |
| 291 ## Quality Control | |
| 292 | |
| 293 ### Data Validation | |
| 294 | |
| 295 #### Pre-analysis Checks | |
| 296 - Verify flux value distributions (check for outliers) | |
| 297 - Ensure sample names match between data and class files | |
| 298 - Validate reaction coverage across samples | |
| 299 - Check for missing values and their patterns | |
| 300 | |
| 301 #### Statistical Validation | |
| 302 - Assess normality assumptions for parametric tests | |
| 303 - Verify adequate sample sizes per group (n≥3 recommended) | |
| 304 - Check variance homogeneity between groups | |
| 305 - Evaluate multiple testing burden | |
| 306 | |
| 307 ### Result Interpretation | |
| 308 | |
| 309 #### Biological Validation | |
| 310 - Compare results with known pathway activities | |
| 311 - Check for pathway coherence (related reactions should cluster) | |
| 312 - Validate against literature or experimental evidence | |
| 313 - Assess metabolic network connectivity | |
| 314 | |
| 315 #### Technical Validation | |
| 316 - Compare results across different statistical tests | |
| 317 - Check sensitivity to parameter changes | |
| 318 - Validate fold change calculations | |
| 319 - Verify map element correspondence | |
| 320 | |
| 321 ## Tips and Best Practices | |
| 322 | |
| 323 ### Data Preparation | |
| 324 - **Normalization**: Ensure consistent flux units across samples | |
| 325 - **Filtering**: Remove reactions with excessive missing values (>50%) | |
| 326 - **Outlier Detection**: Identify and handle extreme flux values | |
| 327 - **Batch Effects**: Account for technical variation between experiments | |
| 328 | |
| 329 ### Statistical Considerations | |
| 330 - Use FDR correction for multiple comparisons (`-adj true`) | |
| 331 - Choose appropriate statistical tests based on data distribution | |
| 332 - Consider effect size (fold change) alongside significance | |
| 333 - Validate results with independent datasets when possible | |
| 334 | |
| 335 ### Visualization Optimization | |
| 336 - Select appropriate color maps for your audience | |
| 337 - Use high fold change thresholds (>2.0) for cleaner maps | |
| 338 - Export both SVG (editable) and PDF (publication) formats | |
| 339 - Include comprehensive legends and annotations | |
| 340 | |
| 341 ### Performance Tips | |
| 342 - Use HMRcore for faster processing and clearer visualizations | |
| 343 - Reduce dataset size for initial exploratory analysis | |
| 344 - Process large datasets in batches if memory constrained | |
| 345 - Cache intermediate results for parameter optimization | |
| 346 | |
| 347 ## Integration Workflow | |
| 348 | |
| 349 ### Upstream Tools | |
| 350 - [Flux Simulation](flux-simulation.md) - Generate flux distributions for comparison | |
| 351 - [MAREA](marea.md) - Alternative analysis pathway for RAS/RPS data | |
| 352 | |
| 353 ### Downstream Analysis | |
| 354 - Export results to statistical software (R, Python) for advanced analysis | |
| 355 - Integrate with pathway databases (KEGG, Reactome) | |
| 356 - Combine with other omics data for systems-level insights | |
| 357 | |
| 358 ### Typical Pipeline | |
| 359 | |
| 360 ```bash | |
| 361 # 1. Generate flux samples from constrained models | |
| 362 flux_simulation -td /opt/COBRAxy -ms ENGRO2 -in bounds/*.tsv \ | |
| 363 -ni Sample1,Sample2,Control1,Control2 -a CBS \ | |
| 364 -ot mean -idop fluxes/ | |
| 365 | |
| 366 # 2. Analyze and visualize flux differences | |
| 367 flux_to_map -td /opt/COBRAxy -idf fluxes/mean.csv \ | |
| 368 -icf sample_groups.tsv -co manyvsmany -te ks \ | |
| 369 -mc HMRcore -gs true -gp true -idop flux_maps/ | |
| 370 | |
| 371 # 3. Further analysis with custom scripts | |
| 372 python analyze_flux_results.py -i flux_maps/ -o final_results/ | |
| 373 ``` | |
| 374 | |
| 375 ## Troubleshooting | |
| 376 | |
| 377 ### Common Issues | |
| 378 | |
| 379 **No significant reactions found** | |
| 380 - Relax (raise) the p-value threshold (`-pv 0.2`) | |
| 381 - Reduce fold change requirement (`-fc 1.2`) | |
| 382 - Check sample group definitions and sizes | |
| 383 - Verify flux data quality and normalization | |
| 384 | |
| 385 **Map rendering problems** | |
| 386 - Check SVG map file integrity and format | |
| 387 - Verify reaction ID matching between data and map | |
| 388 - Ensure sufficient system memory for large maps | |
| 389 - Validate XML structure of custom maps | |
| 390 | |
| 391 **Statistical test failures** | |
| 392 - Check data distribution assumptions | |
| 393 - Verify sufficient sample sizes per group | |
| 394 - Consider alternative non-parametric tests | |
| 395 - Examine variance patterns between groups | |
| 396 | |
| 397 ### Error Messages | |
| 398 | |
| 399 | Error | Cause | Solution | | |
| 400 |-------|-------|----------| | |
| 401 | "Map file not found" | Missing/invalid map path | Check file location and format | | |
| 402 | "No matching reactions" | ID mismatch between data and map | Verify reaction naming consistency | | |
| 403 | "Insufficient data" | Too few samples per group | Increase sample sizes or merge groups | | |
| 404 | "Memory allocation failed" | Large dataset/map combination | Reduce data size or increase system memory | | |
| 405 | |
| 406 ### Performance Issues | |
| 407 | |
| 408 **Slow processing** | |
| 409 - Use HMRcore instead of ENGRO2 for faster rendering | |
| 410 - Reduce dataset size for testing | |
| 411 - Process subsets of reactions separately | |
| 412 - Monitor system resource usage | |
| 413 | |
| 414 **Large output files** | |
| 415 - Use compressed formats when possible | |
| 416 - Reduce map resolution for preliminary analysis | |
| 417 - Export only essential output formats | |
| 418 - Clean temporary files regularly | |
| 419 | |
| 420 ## Advanced Usage | |
| 421 | |
| 422 ### Custom Statistical Functions | |
| 423 | |
| 424 Advanced users can implement custom statistical tests by modifying the analysis functions: | |
| 425 | |
| 426 ```python | |
| 427 def custom_test(group1, group2): | |
| 428     # Placeholder: swap in any two-sample test that returns (statistic, pvalue), e.g. scipy.stats.mannwhitneyu | |
| 429     statistic, pvalue = your_test_function(group1, group2) | |
| 430     return statistic, pvalue | |
| 431 ``` | |
| 432 | |
| 433 ### Batch Processing Script | |
| 434 | |
| 435 Process multiple experiments systematically: | |
| 436 | |
| 437 ```bash | |
| 438 #!/bin/bash | |
| 439 experiments=("exp1" "exp2" "exp3" "exp4") | |
| 440 for exp in "${experiments[@]}"; do | |
| 441     flux_to_map -td /opt/COBRAxy \ | |
| 442         -idf "data/${exp}_flux.tsv" \ | |
| 443         -icf "data/${exp}_classes.tsv" \ | |
| 444         -co manyvsmany -te ks -pv 0.05 \ | |
| 445         -mc HMRcore -gs true -gp true \ | |
| 446         -idop "results/${exp}/" | |
| 447 done | |
| 448 ``` | |
| 449 | |
| 450 ### Result Aggregation | |
| 451 | |
| 452 Combine results across multiple analyses: | |
| 453 | |
| 454 ```bash | |
| 455 # Merge significant reactions across experiments | |
| 456 python merge_flux_results.py \ | |
| 457 -i results/exp*/significant_fluxes.tsv \ | |
| 458 -o combined_significant_reactions.tsv \ | |
| 459 --method intersection | |
| 460 ``` | |
| 461 | |
| 462 ## See Also | |
| 463 | |
| 464 - [Flux Simulation](flux-simulation.md) - Generate input flux distributions | |
| 465 - [MAREA](marea.md) - Alternative pathway analysis approach | |
| 466 - [Custom Map Creation Guide](../tutorials/custom-map-creation.md) | |
| 467 - [Statistical Methods Reference](../tutorials/statistical-methods.md) |
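
The three criteria listed under "Significance Assessment" can be illustrated with a small standalone sketch. It is not the tool's implementation: the function names `benjamini_hochberg` and `select_significant` are hypothetical, Benjamini-Hochberg is assumed as the FDR method (this page does not say which one the tool applies), and fold changes are assumed to be positive ratios that count in either direction.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_significant(results, p_threshold=0.1, fc_threshold=1.5, adjust=False):
    """results maps reaction ID -> (p_value, fold_change); fold_change must be > 0."""
    ids = list(results)
    p_values = [results[r][0] for r in ids]
    if adjust:
        p_values = benjamini_hochberg(p_values)
    significant = []
    for reaction, p in zip(ids, p_values):
        fold_change = results[reaction][1]
        # Assumption: a change counts in either direction (2.0 and 0.5 both pass 1.5).
        magnitude = max(fold_change, 1.0 / fold_change)
        if p <= p_threshold and magnitude >= fc_threshold:
            significant.append(reaction)
    return significant


# Made-up values for illustration only.
example = {
    "R00001": (0.005, 2.0),
    "R00002": (0.04, 1.1),
    "R00003": (0.03, 0.5),
    "R00004": (0.5, 3.0),
}
print(select_significant(example, p_threshold=0.05))
print(select_significant(example, p_threshold=0.05, adjust=True))
```

Enabling the adjustment can only shrink the set of significant reactions, which is why a run with `-adj true` typically reports fewer hits than the same run without it.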

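The pre-analysis check "ensure sample names match between data and class files" can be scripted before running the tool. The sketch below assumes the layouts shown under "Input Formats" (a header row in both files, sample names in the flux header after the first column, and in the first column of the class file). The helper names are hypothetical and not part of COBRAxy.

```python
import csv
import io


def read_flux_samples(flux_tsv):
    """Sample names are the header cells after the leading 'Reaction' column."""
    header = next(csv.reader(flux_tsv, delimiter="\t"))
    return header[1:]


def read_class_samples(class_tsv):
    """Sample names are the first column; the first row is assumed to be a header."""
    reader = csv.reader(class_tsv, delimiter="\t")
    next(reader)  # skip the 'Sample<TAB>Class' header row
    return [row[0] for row in reader if row]


def compare_sample_names(flux_tsv, class_tsv):
    """Return (samples lacking a class, classed samples lacking flux data)."""
    flux = set(read_flux_samples(flux_tsv))
    classes = set(read_class_samples(class_tsv))
    return sorted(flux - classes), sorted(classes - flux)


# In-memory stand-ins for flux_data.tsv and sample_groups.tsv.
flux_text = "Reaction\tSample1\tSample2\tControl1\nR00001\t15.23\t-8.45\t12.8\n"
class_text = "Sample\tClass\nSample1\tTreatment\nSample2\tTreatment\nControl2\tControl\n"

unclassified, without_data = compare_sample_names(
    io.StringIO(flux_text), io.StringIO(class_text)
)
print("No class assigned:", unclassified)
print("No flux column:", without_data)
```

With real files, pass `open("flux_data.tsv", newline="")` and `open("sample_groups.tsv", newline="")` instead of the `io.StringIO` objects. Both returned lists should be empty before the comparison is run.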