view COBRAxy/docs/tools/ras-to-bounds.md @ 509:5956dcf94277 draft default tip

Uploaded
author francesco_lapi
date Wed, 01 Oct 2025 15:34:21 +0000

# RAS to Bounds

Apply Reaction Activity Scores (RAS) as constraints to metabolic model bounds.

## Overview

The RAS to Bounds tool integrates RAS values into metabolic model flux bounds, creating sample-specific constrained models for flux sampling. This enables personalized metabolic modeling based on gene expression patterns.

## Usage

### Command Line

```bash
ras_to_bounds -td /path/to/COBRAxy \
              -ms ENGRO2 \
              -ir ras_scores.tsv \
              -rs true \
              -mes allOpen \
              -idop constrained_bounds/
```

### Galaxy Interface

Select "RAS to Bounds" from the COBRAxy tool suite and configure model and constraint parameters.

## Parameters

### Required Parameters

| Parameter | Flag | Description |
|-----------|------|-------------|
| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
| Model Selector | `-ms, --model_selector` | Model to use: built-in `ENGRO2`, or `Custom` to supply your own |
| RAS Selector | `-rs, --ras_selector` | Whether to apply RAS constraints (e.g. `true`) |

### Model Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Model Selector | `-ms, --model_selector` | Built-in model choice | ENGRO2 |
| Custom Model | `-mo, --model` | Path to custom SBML model | - |
| Model Name | `-mn, --model_name` | Custom model filename | - |

### Medium Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Medium Selector | `-mes, --medium_selector` | Medium configuration | allOpen |
| Custom Medium | `-meo, --medium` | Path to custom medium file | - |

### Constraint Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| RAS Input | `-ir, --input_ras` | RAS scores TSV file | - |
| RAS Names | `-rn, --name` | Sample names for RAS data | - |
| Cell Class | `-cc, --cell_class` | Output cell class information | - |

### Output Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Output Path | `-idop, --output_path` | Directory for constrained bounds | ras_to_bounds/ |
| Output Log | `-ol, --out_log` | Log file path | - |

## Input Formats

### RAS Scores File

Tab-separated format with reactions as rows and samples as columns:

```
Reaction	Sample1	Sample2	Sample3	Control1	Control2
R00001	1.25	0.85	1.42	1.05	0.98
R00002	0.65	1.35	0.72	1.15	1.08
R00003	2.15	2.05	0.45	0.95	1.12
```
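As a rough illustration of how this layout can be consumed, the sketch below parses the table into one dictionary per sample using only the Python standard library. The function name `read_ras_table` is hypothetical and not part of COBRAxy, and treating empty or non-numeric cells as missing scores is an assumption.

```python
import csv
import io


def read_ras_table(handle):
    """Parse a RAS TSV (reactions as rows, samples as columns).

    Returns {sample: {reaction: float or None}}. Empty or non-numeric
    cells are stored as None (an assumption about how gaps are encoded).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the reaction IDs
    table = {sample: {} for sample in samples}
    for row in reader:
        if not row:
            continue  # skip blank lines
        reaction, values = row[0], row[1:]
        for sample, raw in zip(samples, values):
            try:
                table[sample][reaction] = float(raw)
            except ValueError:
                table[sample][reaction] = None
    return table


# Small in-memory example in the same layout as the table above.
example = "Reaction\tSample1\tSample2\nR00001\t1.25\t0.85\nR00002\t0.65\t\n"
ras = read_ras_table(io.StringIO(example))
```

Keeping missing values as `None` rather than `0.0` matters here, because a zero score and an absent score would lead to different bounds.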

### Custom Model File (Optional)

SBML format metabolic model:
- XML format (.xml, .sbml)
- Compressed formats supported (.xml.gz, .xml.zip, .xml.bz2)
- Must contain valid reaction, metabolite, and gene definitions

### Custom Medium File (Optional)

A single-column file with a `reaction` header, listing the exchange reactions that define the growth medium:

```
reaction
EX_glc__D_e
EX_o2_e
EX_pi_e
EX_nh4_e
```

## Algorithm

### Constraint Application

1. **Base Model Loading**: Load specified metabolic model and medium
2. **Bounds Extraction**: Extract original flux bounds for each reaction
3. **RAS Integration**: For each sample and reaction:
   ```
   if RAS > 1.0:
       new_upper_bound = original_upper_bound * RAS
   elif RAS < 1.0:
       new_lower_bound = original_lower_bound * RAS
   # RAS == 1.0 or missing: bounds are left unchanged
   ```
4. **Bounds Output**: Generate sample-specific bounds files
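
The rule in step 3 can be sketched as a small Python function. This is a minimal sketch of the rule as written in this section, not the tool's actual source; the name `apply_ras` is hypothetical, and the example bounds of ±1000 are an assumption.

```python
def apply_ras(lower, upper, ras):
    """Scale one reaction's bounds by its RAS, following the rule above.

    ras may be None when the reaction has no score; the original
    bounds are then returned unchanged.
    """
    if ras is None:
        return lower, upper
    if ras > 1.0:
        upper = upper * ras  # upregulated: widen the upper bound
    elif ras < 1.0:
        lower = lower * ras  # downregulated: shrink the lower bound
    return lower, upper


# Example with assumed original bounds of -1000 and 1000.
original = (-1000.0, 1000.0)
scaled = {
    reaction: apply_ras(*original, ras)
    for reaction, ras in {"R00001": 1.25, "R00002": 0.65, "R00004": None}.items()
}
```

Because only one bound is touched per reaction, a score of exactly 1.0 and a missing score give the same result.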

### Scaling Rules

- **RAS > 1**: Upregulated reaction → upper bound multiplied by RAS (increased flux capacity)
- **RAS < 1**: Downregulated reaction → lower bound multiplied by RAS (decreased flux capacity)
- **RAS = 1**: No change from original bounds
- **Missing RAS**: Original bounds retained

## Output Format

### Bounds Files

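One TSV file per sample with constrained bounds. The example below applies the scaling rules to the `Sample1` column of the RAS table above, assuming original bounds of ±1000:

```
# bounds_Sample1.tsv
Reaction	lower_bound	upper_bound
R00001	-1000	1250
R00002	-650	1000
R00003	-1000	2150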
```
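
For downstream scripting, a bounds file in this layout can be read back with the standard library. The sketch below is illustrative only: `read_bounds` is a hypothetical helper, and skipping lines that start with `#` assumes the filename line in the example is a comment rather than part of the file.

```python
import csv


def read_bounds(path):
    """Load one per-sample bounds TSV into {reaction: (lower, upper)}."""
    bounds = {}
    with open(path, newline="") as handle:
        # Drop blank lines and '#' comment lines before parsing the header.
        rows = (
            line for line in handle
            if line.strip() and not line.startswith("#")
        )
        for row in csv.DictReader(rows, delimiter="\t"):
            bounds[row["Reaction"]] = (
                float(row["lower_bound"]),
                float(row["upper_bound"]),
            )
    return bounds
```

Calling `read_bounds("ras_to_bounds/bounds_Sample1.tsv")` would return a dictionary keyed by reaction ID, which is convenient for comparing samples or setting bounds on a model object.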

### Directory Structure

```
ras_to_bounds/
├── bounds_Sample1.tsv
├── bounds_Sample2.tsv  
├── bounds_Sample3.tsv
├── bounds_Control1.tsv
├── bounds_Control2.tsv
└── constraints_log.txt
```

## Examples

### Basic Usage with Built-in Model

```bash
# Apply RAS constraints to ENGRO2 model
ras_to_bounds -td /opt/COBRAxy \
              -ms ENGRO2 \
              -ir ras_data.tsv \
              -rs true \
              -idop results/bounds/
```

### Custom Model and Medium

```bash
# Use custom model with specific medium
ras_to_bounds -td /opt/COBRAxy \
              -ms Custom \
              -mo models/custom_model.xml \
              -mn custom_model.xml \
              -mes custom \
              -meo media/minimal_medium.tsv \
              -ir patient_ras.tsv \
              -rs true \
              -idop personalized_models/ \
              -ol constraints.log
```

### Multiple Sample Processing

```bash
# Process cohort data with sample classes
ras_to_bounds -td /opt/COBRAxy \
              -ms ENGRO2 \
              -ir cohort_ras_scores.tsv \
              -rn "Patient1,Patient2,Patient3,Healthy1,Healthy2" \
              -rs true \
              -cc sample_classes.tsv \
              -idop cohort_bounds/
```

## Built-in Models

### ENGRO2
- **Species**: Homo sapiens
- **Scope**: Genome-scale reconstruction  
- **Reactions**: ~2,000 reactions
- **Metabolites**: ~1,500 metabolites
- **Use Case**: General human metabolism

### Custom Model Requirements
- Valid SBML format
- Consistent reaction/metabolite naming
- Proper compartment definitions
- Gene-protein-reaction associations

## Medium Configurations

### allOpen (Default)
- All exchange reactions unconstrained
- Maximum metabolic flexibility
- Suitable for exploratory analysis

### Custom Medium
- User-defined nutrient availability
- Tissue-specific conditions
- Disease-specific constraints

## Quality Control

### Pre-processing Checks
- Verify RAS data completeness (recommend >80% reaction coverage)
- Check for extreme RAS values (>10 or <0.1 may indicate issues)
- Validate model consistency and solvability
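
The first two checks can be approximated with a few lines of Python. This is a sketch under stated assumptions: `ras_quality_report` is a hypothetical helper, the thresholds mirror the values suggested above, and coverage is computed per sample against a list of model reaction IDs supplied by the caller.

```python
def ras_quality_report(ras, model_reactions, low=0.1, high=10.0):
    """Return (coverage, extreme_reactions) for one sample.

    ras: {reaction: score or None}
    model_reactions: iterable of reaction IDs present in the model
    """
    model_reactions = set(model_reactions)
    # Reactions of the model that actually carry a score.
    scored = {r for r in model_reactions if ras.get(r) is not None}
    coverage = len(scored) / len(model_reactions) if model_reactions else 0.0
    # Scores outside [low, high] are flagged for manual inspection.
    extremes = sorted(r for r in scored if ras[r] > high or ras[r] < low)
    return coverage, extremes


sample = {"R1": 1.2, "R2": 0.05, "R3": None, "R4": 12.0}
coverage, extremes = ras_quality_report(sample, ["R1", "R2", "R3", "R4", "R5"])
if coverage < 0.8:
    print(f"warning: only {coverage:.0%} of model reactions have a RAS value")
```

Flagged reactions are only candidates for review; whether an extreme score is an artifact depends on the data.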

### Post-processing Validation
- Confirm bounds files generated for all samples
- Check constraint log for warnings
- Test model feasibility with sample bounds
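
The first check is easy to automate. The sketch below assumes the `bounds_<sample>.tsv` naming shown under Directory Structure; `missing_bounds_files` is a hypothetical helper, not a COBRAxy function.

```python
from pathlib import Path


def missing_bounds_files(output_dir, samples):
    """Return the samples that have no bounds_<sample>.tsv in output_dir."""
    out = Path(output_dir)
    return [s for s in samples if not (out / f"bounds_{s}.tsv").is_file()]
```

An empty list means every sample produced a file; any names returned are the samples to investigate in the constraint log.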

## Tips and Best Practices

### RAS Data Preparation
- **Normalization**: Ensure RAS values are properly normalized (median ~1.0)
- **Filtering**: Remove reactions with consistently missing data
- **Validation**: Check RAS distributions across samples
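
One simple way to bring a sample's scores to a median near 1.0 is to divide each score by that sample's median. This is an assumption made for illustration, not necessarily the normalization that RAS Generator or this tool applies; `median_normalize` is a hypothetical helper.

```python
from statistics import median


def median_normalize(ras):
    """Divide every available score by the sample median.

    ras: {reaction: score or None}; None entries are passed through.
    """
    present = [v for v in ras.values() if v is not None]
    if not present:
        return dict(ras)  # nothing to normalize
    center = median(present)
    if center == 0:
        raise ValueError("median RAS is zero; cannot normalize")
    return {r: (None if v is None else v / center) for r, v in ras.items()}


normalized = median_normalize({"R1": 2.0, "R2": 4.0, "R3": 8.0, "R4": None})
```

After this step the median of the available scores is 1.0 by construction, so reactions above and below the median fall on opposite sides of the scaling rule.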

### Model Selection
- Use ENGRO2 for general human tissue analysis
- Consider custom models for specific organisms or tissues
- Validate model scope matches your biological question

### Medium Configuration  
- Match medium to experimental conditions
- Use minimal medium for growth requirement analysis
- Consider tissue-specific nutrient availability

## Integration Workflow

### Upstream Tools
- [RAS Generator](ras-generator.md) - Generate RAS scores from expression data

### Downstream Tools  
- [Flux Simulation](flux-simulation.md) - Sample fluxes using constrained bounds
- [MAREA](marea.md) - Statistical analysis of constraint effects

### Typical Pipeline

```bash
# 1. Generate RAS from expression data
ras_generator -td /opt/COBRAxy -in expression.tsv -ra ras.tsv

# 2. Apply RAS constraints to model bounds  
ras_to_bounds -td /opt/COBRAxy -ms ENGRO2 -ir ras.tsv -rs true -idop bounds/

# 3. Sample fluxes with constraints
flux_simulation -td /opt/COBRAxy -ms ENGRO2 -in bounds/*.tsv -a CBS -idop fluxes/

# 4. Analyze and visualize results
marea -td /opt/COBRAxy -input_data fluxes/mean.tsv -choice_map ENGRO2 -idop maps/
```

## Troubleshooting

### Common Issues

**No bounds files generated**
- Check RAS file format and sample names
- Verify model loading (check model path/format)
- Ensure sufficient disk space for output

**Model infeasibility after constraints**
- RAS values may be too restrictive
- Consider rescaling or re-normalizing the RAS values before applying them
- Check medium compatibility with constraints

**Missing reactions in bounds**  
- RAS data may not cover all model reactions
- Original bounds retained for missing reactions
- Consider reaction mapping validation

### Error Messages

| Error | Cause | Solution |
|-------|-------|----------|
| "Model not found" | Invalid model path | Check model file location |
| "RAS file invalid" | Malformed TSV format | Verify file structure and encoding |
| "Infeasible solution" | Over-constrained model | Relax RAS scaling or medium constraints |

### Performance Issues

**Slow processing**
- Large models may require significant memory
- Consider batch processing for many samples
- Monitor system resource usage

**Memory errors**
- Reduce model size or split processing
- Increase available system memory
- Use more efficient file formats

## Advanced Usage

### Batch Processing Script

```bash
#!/bin/bash
# Process multiple RAS files
for ras_file in ras_data/*.tsv; do
    sample_name=$(basename "$ras_file" .tsv)
    ras_to_bounds -td /opt/COBRAxy \
                  -ms ENGRO2 \
                  -ir "$ras_file" \
                  -rs true \
                  -idop "bounds_$sample_name/"
done
```

### Custom Scaling Functions

For advanced users, RAS scaling can be customized by modifying the constraint application logic in the source code.

## See Also

- [RAS Generator](ras-generator.md) - Generate input RAS data
- [Flux Simulation](flux-simulation.md) - Use constrained bounds for sampling  
- [Model Setting](metabolic-model-setting.md) - Extract model components