Mercurial > repos > bimib > cobraxy
diff COBRAxy/docs/tools/ras-to-bounds.md @ 547:73f2f7e2be17 draft
Uploaded
| author | francesco_lapi |
|---|---|
| date | Tue, 28 Oct 2025 10:44:07 +0000 |
| parents | fcdbc81feb45 |
| children |
line wrap: on
line diff
--- a/COBRAxy/docs/tools/ras-to-bounds.md Mon Oct 27 12:33:08 2025 +0000 +++ b/COBRAxy/docs/tools/ras-to-bounds.md Tue Oct 28 10:44:07 2025 +0000 @@ -4,330 +4,106 @@ ## Overview -The RAS to Bounds tool integrates RAS values into metabolic model flux bounds, creating sample-specific constrained models for flux sampling. This enables personalized metabolic modeling based on gene expression patterns. +RAS to Bounds integrates RAS values into metabolic model flux bounds, creating sample-specific constrained models. + +## Galaxy Interface + +In Galaxy: **COBRAxy → RAS to Bounds** + +1. Select model and upload RAS scores file +2. Configure medium and constraint options +3. Click **Execute** ## Usage -### Command Line - ```bash -ras_to_bounds -td /path/to/COBRAxy \ - -ms ENGRO2 \ +ras_to_bounds -ms ENGRO2 \ -ir ras_scores.tsv \ -rs true \ -mes allOpen \ -idop constrained_bounds/ ``` -### Galaxy Interface - -Select "RAS to Bounds" from the COBRAxy tool suite and configure model and constraint parameters. - ## Parameters -### Required Parameters - -| Parameter | Flag | Description | -|-----------|------|-------------| -| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory | -| Model Selector | `-ms, --model_selector` | Built-in model (ENGRO2, Custom) | -| RAS Selector | `-rs, --ras_selector` | Enable RAS constraint application | - -### Model Parameters - -| Parameter | Flag | Description | Default | -|-----------|------|-------------|---------| -| Model Selector | `-ms, --model_selector` | Built-in model choice | ENGRO2 | -| Custom Model | `-mo, --model` | Path to custom SBML model | - | -| Model Name | `-mn, --model_name` | Custom model filename | - | - -### Medium Parameters - -| Parameter | Flag | Description | Default | -|-----------|------|-------------|---------| -| Medium Selector | `-mes, --medium_selector` | Medium configuration | allOpen | -| Custom Medium | `-meo, --medium` | Path to custom medium file | - | - -### Constraint Parameters - -| Parameter | Flag | Description | Default | -|-----------|------|-------------|---------| -| RAS Input | `-ir, --input_ras` | RAS scores TSV file | - | -| RAS Names | `-rn, --name` | Sample names for RAS data | - | -| Cell Class | `-cc, --cell_class` | Output cell class information | - | - -### Output Parameters - | Parameter | Flag | Description | Default | |-----------|------|-------------|---------| -| Output Path | `-idop, --output_path` | Directory for constrained bounds | ras_to_bounds/ | -| Output Log | `-ol, --out_log` | Log file path | - | - -## Input Formats - -### RAS Scores File - -Tab-separated format with reactions as rows and samples as columns: +| Model Selector | `-ms` | ENGRO2, Recon, or Custom | ENGRO2 | +| RAS Input | `-ir` | RAS scores TSV file | - | +| RAS Selector | `-rs` | Enable RAS constraints | false | +| Medium Selector | `-mes` | Medium configuration | allOpen | +| Save Models | `--save_models` | Save complete models with bounds | false | +| Output Path | `-idop` | Output directory | ras_to_bounds/ | +| Custom Model | `-mo` | Path to custom SBML model | - | +| Custom Medium | `-meo` | Custom medium file | - | -``` -Reaction Sample1 Sample2 Sample3 Control1 Control2 -R00001 1.25 0.85 1.42 1.05 0.98 -R00002 0.65 1.35 0.72 1.15 1.08 -R00003 2.15 2.05 0.45 0.95 1.12 -``` - -### Custom Model File (Optional) +## Input Format -SBML format metabolic model: -- XML format (.xml, .sbml) -- Compressed formats supported (.xml.gz, .xml.zip, .xml.bz2) -- Must contain valid reaction, metabolite, and gene definitions - -### Custom Medium File (Optional) - -Exchange reactions defining growth medium: +RAS scores file (TSV): ``` -reaction -EX_glc__D_e -EX_o2_e -EX_pi_e -EX_nh4_e +Reaction Sample1 Sample2 Sample3 +R00001 1.25 0.85 1.42 +R00002 0.65 1.35 0.72 ``` -## Algorithm - -### Constraint Application - -1. **Base Model Loading**: Load specified metabolic model and medium -2. **Bounds Extraction**: Extract original flux bounds for each reaction -3. **RAS Integration**: For each sample and reaction: - ``` - if RAS > 1.0: - new_upper_bound = original_upper_bound * RAS - if RAS < 1.0: - new_lower_bound = original_lower_bound * RAS - ``` -4. **Bounds Output**: Generate sample-specific bounds files - -### Scaling Rules - -- **RAS > 1**: Upregulated reactions → increased flux capacity -- **RAS < 1**: Downregulated reactions → decreased flux capacity -- **RAS = 1**: No change from original bounds -- **Missing RAS**: Original bounds retained +**File Format Notes:** +- Use **tab-separated** values (TSV) +- First row must contain column headers (Reaction, Sample names) +- Reaction IDs must match model reaction IDs +- Numeric values for RAS scores ## Output Format -### Bounds Files - -One TSV file per sample with constrained bounds: +Bounds files for each sample: ``` -# bounds_Sample1.tsv -Reaction lower_bound upper_bound -R00001 -1000 1250.5 -R00002 -650.2 1000 -R00003 -1000 2150.8 +reaction lower_bound upper_bound +R00001 -125.0 125.0 +R00002 -65.0 65.0 ``` -### Directory Structure +## Output Collections + +The tool generates three types of output: -``` -ras_to_bounds/ -├── bounds_Sample1.tsv -├── bounds_Sample2.tsv -├── bounds_Sample3.tsv -├── bounds_Control1.tsv -├── bounds_Control2.tsv -└── constraints_log.txt -``` +1. **Bounds files** (`ras_to_bounds/`): Individual bound files per sample (TSV format) +2. **Cell classes** (`cell_class`): Sample-to-class mapping file +3. **Complete models** (optional, `saved_models/`): Full tabular models with bounds applied + +To save complete models with integrated bounds, set `--save_models true`. This creates ready-to-use model files that can be directly used with Flux Simulation or other downstream tools. ## Examples -### Basic Usage with Built-in Model - -```bash -# Apply RAS constraints to ENGRO2 model -ras_to_bounds -td /opt/COBRAxy \ - -ms ENGRO2 \ - -ir ras_data.tsv \ - -rs true \ - -idop results/bounds/ -``` - -### Custom Model and Medium +### Basic Usage ```bash -# Use custom model with specific medium -ras_to_bounds -td /opt/COBRAxy \ - -ms Custom \ - -mo models/custom_model.xml \ - -mn custom_model.xml \ - -mes custom \ - -meo media/minimal_medium.tsv \ - -ir patient_ras.tsv \ +ras_to_bounds -ms ENGRO2 \ + -ir ras_scores.tsv \ -rs true \ - -idop personalized_models/ \ - -ol constraints.log -``` - -### Multiple Sample Processing - -```bash -# Process cohort data with sample classes -ras_to_bounds -td /opt/COBRAxy \ - -ms ENGRO2 \ - -ir cohort_ras_scores.tsv \ - -rn "Patient1,Patient2,Patient3,Healthy1,Healthy2" \ - -rs true \ - -cc sample_classes.tsv \ - -idop cohort_bounds/ + -idop output/ ``` -## Built-in Models - -### ENGRO2 -- **Species**: Homo sapiens -- **Scope**: Genome-scale reconstruction -- **Reactions**: ~2,000 reactions -- **Metabolites**: ~1,500 metabolites -- **Use Case**: General human metabolism - -### Custom Model Requirements -- Valid SBML format -- Consistent reaction/metabolite naming -- Proper compartment definitions -- Gene-protein-reaction associations - -## Medium Configurations - -### allOpen (Default) -- All exchange reactions unconstrained -- Maximum metabolic flexibility -- Suitable for exploratory analysis - -### Custom Medium -- User-defined nutrient availability -- Tissue-specific conditions -- Disease-specific constraints - -## Quality Control - -### Pre-processing Checks -- Verify RAS data completeness (recommend >80% reaction coverage) -- Check for extreme RAS values (>10 or <0.1 may indicate issues) -- Validate model consistency and solvability - -### Post-processing Validation -- Confirm bounds files generated for all samples -- Check constraint log for warnings -- Test model feasibility with sample bounds - -## Tips and Best Practices - -### RAS Data Preparation -- **Normalization**: Ensure RAS values are properly normalized (median ~1.0) -- **Filtering**: Remove reactions with consistently missing data -- **Validation**: Check RAS distributions across samples - -### Model Selection -- Use ENGRO2 for general human tissue analysis -- Consider custom models for specific organisms or tissues -- Validate model scope matches your biological question - -### Medium Configuration -- Match medium to experimental conditions -- Use minimal medium for growth requirement analysis -- Consider tissue-specific nutrient availability - -## Integration Workflow - -### Upstream Tools -- [RAS Generator](ras-generator.md) - Generate RAS scores from expression data - -### Downstream Tools -- [Flux Simulation](flux-simulation.md) - Sample fluxes using constrained bounds -- [MAREA](marea.md) - Statistical analysis of constraint effects - -### Typical Pipeline +### With Custom Model ```bash -# 1. Generate RAS from expression data -ras_generator -td /opt/COBRAxy -in expression.tsv -ra ras.tsv - -# 2. Apply RAS constraints to model bounds -ras_to_bounds -td /opt/COBRAxy -ms ENGRO2 -ir ras.tsv -rs true -idop bounds/ - -# 3. Sample fluxes with constraints -flux_simulation -td /opt/COBRAxy -ms ENGRO2 -in bounds/*.tsv -a CBS -idop fluxes/ - -# 4. Analyze and visualize results -marea -td /opt/COBRAxy -input_data fluxes/mean.tsv -choice_map ENGRO2 -idop maps/ +ras_to_bounds -ms Custom \ + -mo custom_model.xml \ + -ir ras_scores.tsv \ + -rs true \ + -idop output/ ``` ## Troubleshooting -### Common Issues - -**No bounds files generated** -- Check RAS file format and sample names -- Verify model loading (check model path/format) -- Ensure sufficient disk space for output - -**Model infeasibility after constraints** -- RAS values may be too restrictive -- Consider scaling factor adjustment -- Check medium compatibility with constraints - -**Missing reactions in bounds** -- RAS data may not cover all model reactions -- Original bounds retained for missing reactions -- Consider reaction mapping validation - -### Error Messages - -| Error | Cause | Solution | -|-------|-------|----------| -| "Model not found" | Invalid model path | Check model file location | -| "RAS file invalid" | Malformed TSV format | Verify file structure and encoding | -| "Infeasible solution" | Over-constrained model | Relax RAS scaling or medium constraints | - -### Performance Issues - -**Slow processing** -- Large models may require significant memory -- Consider batch processing for many samples -- Monitor system resource usage - -**Memory errors** -- Reduce model size or split processing -- Increase available system memory -- Use more efficient file formats - -## Advanced Usage - -### Batch Processing Script - -```bash -#!/bin/bash -# Process multiple RAS files -for ras_file in ras_data/*.tsv; do - sample_name=$(basename "$ras_file" .tsv) - ras_to_bounds -td /opt/COBRAxy \ - -ms ENGRO2 \ - -ir "$ras_file" \ - -rs true \ - -idop "bounds_$sample_name/" -done -``` - -### Custom Scaling Functions - -For advanced users, RAS scaling can be customized by modifying the constraint application logic in the source code. +| Error | Solution | +|-------|----------| +| "Model not found" | Check model file path | +| "RAS file invalid" | Verify TSV format | +| "Infeasible solution" | Relax RAS scaling or constraints | ## See Also -- [RAS Generator](ras-generator.md) - Generate input RAS data -- [Flux Simulation](flux-simulation.md) - Use constrained bounds for sampling -- [Import Metabolic Model](import-metabolic-model.md) - Extract model components \ No newline at end of file +- [RAS Generator](tools/ras-generator) +- [Flux Simulation](tools/flux-simulation) +- [Built-in Models](reference/built-in-models)
