diff COBRAxy/docs/tools/ras-to-bounds.md @ 492:4ed95023af20 draft

Uploaded
author francesco_lapi
date Tue, 30 Sep 2025 14:02:17 +0000
parents
children
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/COBRAxy/docs/tools/ras-to-bounds.md	Tue Sep 30 14:02:17 2025 +0000
@@ -0,0 +1,333 @@
+# RAS to Bounds
+
+Apply Reaction Activity Scores (RAS) as constraints to metabolic model bounds.
+
+## Overview
+
+The RAS to Bounds tool integrates RAS values into metabolic model flux bounds, creating sample-specific constrained models for flux sampling. This enables personalized metabolic modeling based on gene expression patterns.
+
+## Usage
+
+### Command Line
+
+```bash
+ras_to_bounds -td /path/to/COBRAxy \
+              -ms ENGRO2 \
+              -ir ras_scores.tsv \
+              -rs true \
+              -mes allOpen \
+              -idop constrained_bounds/
+```
+
+### Galaxy Interface
+
+Select "RAS to Bounds" from the COBRAxy tool suite and configure model and constraint parameters.
+
+## Parameters
+
+### Required Parameters
+
+| Parameter | Flag | Description |
+|-----------|------|-------------|
+| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
+| Model Selector | `-ms, --model_selector` | Built-in model (ENGRO2, Custom) |
+| RAS Selector | `-rs, --ras_selector` | Enable RAS constraint application |
+
+### Model Parameters
+
+| Parameter | Flag | Description | Default |
+|-----------|------|-------------|---------|
+| Model Selector | `-ms, --model_selector` | Built-in model choice | ENGRO2 |
+| Custom Model | `-mo, --model` | Path to custom SBML model | - |
+| Model Name | `-mn, --model_name` | Custom model filename | - |
+
+### Medium Parameters
+
+| Parameter | Flag | Description | Default |
+|-----------|------|-------------|---------|
+| Medium Selector | `-mes, --medium_selector` | Medium configuration | allOpen |
+| Custom Medium | `-meo, --medium` | Path to custom medium file | - |
+
+### Constraint Parameters
+
+| Parameter | Flag | Description | Default |
+|-----------|------|-------------|---------|
+| RAS Input | `-ir, --input_ras` | RAS scores TSV file | - |
+| RAS Names | `-rn, --name` | Sample names for RAS data | - |
+| Cell Class | `-cc, --cell_class` | Output cell class information | - |
+
+### Output Parameters
+
+| Parameter | Flag | Description | Default |
+|-----------|------|-------------|---------|
+| Output Path | `-idop, --output_path` | Directory for constrained bounds | ras_to_bounds/ |
+| Output Log | `-ol, --out_log` | Log file path | - |
+
+## Input Formats
+
+### RAS Scores File
+
+Tab-separated format with reactions as rows and samples as columns:
+
+```
+Reaction	Sample1	Sample2	Sample3	Control1	Control2
+R00001	1.25	0.85	1.42	1.05	0.98
+R00002	0.65	1.35	0.72	1.15	1.08
+R00003	2.15	2.05	0.45	0.95	1.12
+```
+
+### Custom Model File (Optional)
+
+SBML format metabolic model:
+- XML format (.xml, .sbml)
+- Compressed formats supported (.xml.gz, .xml.zip, .xml.bz2)
+- Must contain valid reaction, metabolite, and gene definitions
+
+### Custom Medium File (Optional)
+
+Exchange reactions defining growth medium:
+
+```
+reaction
+EX_glc__D_e
+EX_o2_e
+EX_pi_e
+EX_nh4_e
+```
+
+## Algorithm
+
+### Constraint Application
+
+1. **Base Model Loading**: Load specified metabolic model and medium
+2. **Bounds Extraction**: Extract original flux bounds for each reaction  
+3. **RAS Integration**: For each sample and reaction:
+   ```
+   if RAS > 1.0:
+       new_upper_bound = original_upper_bound * RAS
+   if RAS < 1.0:  
+       new_lower_bound = original_lower_bound * RAS
+   ```
+4. **Bounds Output**: Generate sample-specific bounds files
+
+### Scaling Rules
+
+- **RAS > 1**: Upregulated reactions → increased flux capacity
+- **RAS < 1**: Downregulated reactions → decreased flux capacity  
+- **RAS = 1**: No change from original bounds
+- **Missing RAS**: Original bounds retained
+
+## Output Format
+
+### Bounds Files
+
+One TSV file per sample with constrained bounds:
+
+```
+# bounds_Sample1.tsv
+Reaction	lower_bound	upper_bound
+R00001	-1000	1250.5
+R00002	-650.2	1000  
+R00003	-1000	2150.8
+```
+
+### Directory Structure
+
+```
+ras_to_bounds/
+├── bounds_Sample1.tsv
+├── bounds_Sample2.tsv  
+├── bounds_Sample3.tsv
+├── bounds_Control1.tsv
+├── bounds_Control2.tsv
+└── constraints_log.txt
+```
+
+## Examples
+
+### Basic Usage with Built-in Model
+
+```bash
+# Apply RAS constraints to ENGRO2 model
+ras_to_bounds -td /opt/COBRAxy \
+              -ms ENGRO2 \
+              -ir ras_data.tsv \
+              -rs true \
+              -idop results/bounds/
+```
+
+### Custom Model and Medium
+
+```bash
+# Use custom model with specific medium
+ras_to_bounds -td /opt/COBRAxy \
+              -ms Custom \
+              -mo models/custom_model.xml \
+              -mn custom_model.xml \
+              -mes custom \
+              -meo media/minimal_medium.tsv \
+              -ir patient_ras.tsv \
+              -rs true \
+              -idop personalized_models/ \
+              -ol constraints.log
+```
+
+### Multiple Sample Processing
+
+```bash
+# Process cohort data with sample classes
+ras_to_bounds -td /opt/COBRAxy \
+              -ms ENGRO2 \
+              -ir cohort_ras_scores.tsv \
+              -rn "Patient1,Patient2,Patient3,Healthy1,Healthy2" \
+              -rs true \
+              -cc sample_classes.tsv \
+              -idop cohort_bounds/
+```
+
+## Built-in Models
+
+### ENGRO2
+- **Species**: Homo sapiens
+- **Scope**: Genome-scale reconstruction  
+- **Reactions**: ~2,000 reactions
+- **Metabolites**: ~1,500 metabolites
+- **Use Case**: General human metabolism
+
+### Custom Model Requirements
+- Valid SBML format
+- Consistent reaction/metabolite naming
+- Proper compartment definitions
+- Gene-protein-reaction associations
+
+## Medium Configurations
+
+### allOpen (Default)
+- All exchange reactions unconstrained
+- Maximum metabolic flexibility
+- Suitable for exploratory analysis
+
+### Custom Medium
+- User-defined nutrient availability
+- Tissue-specific conditions
+- Disease-specific constraints
+
+## Quality Control
+
+### Pre-processing Checks
+- Verify RAS data completeness (recommend >80% reaction coverage)
+- Check for extreme RAS values (>10 or <0.1 may indicate issues)
+- Validate model consistency and solvability
+
+### Post-processing Validation
+- Confirm bounds files generated for all samples
+- Check constraint log for warnings
+- Test model feasibility with sample bounds
+
+## Tips and Best Practices
+
+### RAS Data Preparation
+- **Normalization**: Ensure RAS values are properly normalized (median ~1.0)
+- **Filtering**: Remove reactions with consistently missing data
+- **Validation**: Check RAS distributions across samples
+
+### Model Selection
+- Use ENGRO2 for general human tissue analysis
+- Consider custom models for specific organisms or tissues
+- Validate model scope matches your biological question
+
+### Medium Configuration  
+- Match medium to experimental conditions
+- Use minimal medium for growth requirement analysis
+- Consider tissue-specific nutrient availability
+
+## Integration Workflow
+
+### Upstream Tools
+- [RAS Generator](ras-generator.md) - Generate RAS scores from expression data
+
+### Downstream Tools  
+- [Flux Simulation](flux-simulation.md) - Sample fluxes using constrained bounds
+- [MAREA](marea.md) - Statistical analysis of constraint effects
+
+### Typical Pipeline
+
+```bash
+# 1. Generate RAS from expression data
+ras_generator -td /opt/COBRAxy -in expression.tsv -ra ras.tsv
+
+# 2. Apply RAS constraints to model bounds  
+ras_to_bounds -td /opt/COBRAxy -ms ENGRO2 -ir ras.tsv -rs true -idop bounds/
+
+# 3. Sample fluxes with constraints
+flux_simulation -td /opt/COBRAxy -ms ENGRO2 -in bounds/*.tsv -a CBS -idop fluxes/
+
+# 4. Analyze and visualize results
+marea -td /opt/COBRAxy -input_data fluxes/mean.tsv -choice_map ENGRO2 -idop maps/
+```
+
+## Troubleshooting
+
+### Common Issues
+
+**No bounds files generated**
+- Check RAS file format and sample names
+- Verify model loading (check model path/format)
+- Ensure sufficient disk space for output
+
+**Model infeasibility after constraints**
+- RAS values may be too restrictive
+- Consider scaling factor adjustment
+- Check medium compatibility with constraints
+
+**Missing reactions in bounds**  
+- RAS data may not cover all model reactions
+- Original bounds retained for missing reactions
+- Consider reaction mapping validation
+
+### Error Messages
+
+| Error | Cause | Solution |
+|-------|-------|----------|
+| "Model not found" | Invalid model path | Check model file location |
+| "RAS file invalid" | Malformed TSV format | Verify file structure and encoding |
+| "Infeasible solution" | Over-constrained model | Relax RAS scaling or medium constraints |
+
+### Performance Issues
+
+**Slow processing**
+- Large models may require significant memory
+- Consider batch processing for many samples
+- Monitor system resource usage
+
+**Memory errors**
+- Reduce model size or split processing
+- Increase available system memory
+- Use more efficient file formats
+
+## Advanced Usage
+
+### Batch Processing Script
+
+```bash
+#!/bin/bash
+# Process multiple RAS files
+for ras_file in ras_data/*.tsv; do
+    sample_name=$(basename "$ras_file" .tsv)
+    ras_to_bounds -td /opt/COBRAxy \
+                  -ms ENGRO2 \
+                  -ir "$ras_file" \
+                  -rs true \
+                  -idop "bounds_$sample_name/"
+done
+```
+
+### Custom Scaling Functions
+
+For advanced users, RAS scaling can be customized by modifying the constraint application logic in the source code.
+
+## See Also
+
+- [RAS Generator](ras-generator.md) - Generate input RAS data
+- [Flux Simulation](flux-simulation.md) - Use constrained bounds for sampling  
+- [Model Setting](metabolic-model-setting.md) - Extract model components
\ No newline at end of file