view COBRAxy/README.md @ 530:352c51a39e23 draft
Uploaded
| author | francesco_lapi |
|---|---|
| date | Wed, 22 Oct 2025 12:18:07 +0000 |
| parents | 4ed95023af20 |
| children |
<div align="center">
  <img src="docs/_media/logo.png" alt="COBRAxy Logo" width="200"/>
</div>

# COBRAxy

A Python toolkit for metabolic flux analysis and visualization, with Galaxy integration.

COBRAxy transforms gene expression and metabolite data into meaningful metabolic insights through flux sampling and interactive pathway maps.

DOC: https://compbtbs.github.io/COBRAxy

## Features

- **Reaction Activity Scores (RAS)** from gene expression data
- **Reaction Propensity Scores (RPS)** from metabolite abundance
- **Flux sampling** with CBS or OptGP algorithms
- **Statistical analysis** with pFBA, FVA, and sensitivity analysis
- **Interactive maps** with SVG/PDF export and custom styling
- **Galaxy tools** for web-based analysis
- **Built-in models** including ENGRO2 and Recon

## Quick Start

### Installation

```bash
git clone https://github.com/CompBtBs/COBRAxy.git
cd COBRAxy
pip install .
```

### Basic Workflow

```bash
# 1. Generate RAS from expression data
ras_generator -td $(pwd) -in expression.tsv -ra ras_output.tsv -rs ENGRO2

# 2. Generate RPS from metabolite data (optional)
rps_generator -td $(pwd) -id metabolites.tsv -rp rps_output.tsv

# 3. Create enriched pathway maps with statistical analysis
marea -td $(pwd) -using_RAS true -input_data ras_output.tsv -choice_map ENGRO2 -gs true -idop base_maps

# 4. Apply RAS constraints to model for flux simulation
ras_to_bounds -td $(pwd) -ms ENGRO2 -ir ras_output.tsv -rs true -idop bounds_output

# 5. Sample metabolic fluxes with constrained model
flux_simulation -td $(pwd) -ms ENGRO2 -in bounds_output/*.tsv -a CBS -ns 1000 -idop flux_results

# 6. Add flux data to enriched maps
flux_to_map -td $(pwd) -if flux_results/*.tsv -mp base_maps/*.svg -idop final_maps
```

## Tools

| Tool | Purpose | Input | Output |
|------|---------|--------|---------|
| `metabolic_model_setting` | Extract model components | SBML model | Rules, reactions, bounds, medium |
| `ras_generator` | Compute reaction activity scores | Gene expression data | RAS values |
| `rps_generator` | Compute reaction propensity scores | Metabolite abundance | RPS values |
| `marea` | Statistical pathway analysis | RAS + RPS data | Enrichment + base maps |
| `ras_to_bounds` | Apply RAS constraints to model | RAS + SBML model | Constrained bounds |
| `flux_simulation` | Sample metabolic fluxes | Constrained model | Flux distributions |
| `flux_to_map` | Add fluxes to enriched maps | Flux samples + base maps | Final styled maps |
| `marea_cluster` | Cluster analysis | Expression/flux data | Sample clusters |

## Requirements

- **Python**: 3.8-3.11
- **OS**: Linux, macOS, Windows (Linux recommended)
- **Dependencies**: Automatically installed via pip (COBRApy, pandas, numpy, etc.)

**Optional system libraries** (for enhanced features):

```bash
# Ubuntu/Debian
sudo apt-get install libvips libglpk40 glpk-utils

# For Python GLPK bindings
pip install swiglpk
```

## Data Flow

```
Gene Expression    Metabolite Data      SBML Model
      ↓                  ↓                  ↓
RAS Generator      RPS Generator       Model Tables
      ↓                  ↓
  RAS Values         RPS Values             |
      ↓                  ↓                  |
      └─────────┬─────────┘                 |
                ↓                           |
              MAREA                         |
         (Enrichment +                      |
           Base Maps)                       ↓
  RAS Values → RAS to Bounds ←── Model Tables
                ↓
        Constrained Model
                ↓
         Flux Simulation
                ↓
          Flux Samples
                ↓
          Flux to Map ←── Maps (ENGRO2)
                ↓
       Final Enriched Maps
```

## Built-in Models & Data

COBRAxy includes ready-to-use resources:

- **Models**: ENGRO2, Recon (human metabolism)
- **Gene mappings**: HGNC, Ensembl, Entrez ID conversions
- **Pathway maps**: Pre-styled SVG templates
- **Medium compositions**: Standard growth conditions

Located in `local/` directory for immediate use.
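
The six Basic Workflow commands can also be driven from a script. The following is a minimal sketch, assuming the console scripts are on `PATH` after `pip install .` and that the flags behave exactly as in the shell examples above. The helper names `build_pipeline`, `expand_globs`, and `run_pipeline` are hypothetical and not part of COBRAxy. Because `subprocess` does not expand `*.tsv` the way a shell does, the sketch expands glob patterns itself just before each step runs.

```python
import glob
import os
import shlex
import subprocess


def build_pipeline(tool_dir, model="ENGRO2", samples=1000):
    """Return the six Basic Workflow commands as argv lists (hypothetical helper)."""
    td = str(tool_dir)
    return [
        ["ras_generator", "-td", td, "-in", "expression.tsv",
         "-ra", "ras_output.tsv", "-rs", model],
        ["rps_generator", "-td", td, "-id", "metabolites.tsv",
         "-rp", "rps_output.tsv"],  # optional step
        ["marea", "-td", td, "-using_RAS", "true", "-input_data", "ras_output.tsv",
         "-choice_map", model, "-gs", "true", "-idop", "base_maps"],
        ["ras_to_bounds", "-td", td, "-ms", model, "-ir", "ras_output.tsv",
         "-rs", "true", "-idop", "bounds_output"],
        ["flux_simulation", "-td", td, "-ms", model, "-in", "bounds_output/*.tsv",
         "-a", "CBS", "-ns", str(samples), "-idop", "flux_results"],
        ["flux_to_map", "-td", td, "-if", "flux_results/*.tsv",
         "-mp", "base_maps/*.svg", "-idop", "final_maps"],
    ]


def expand_globs(argv):
    """Replace each argument containing '*' with its sorted matches.

    A pattern with no matches is kept unchanged, like a default POSIX shell.
    """
    expanded = []
    for arg in argv:
        matches = sorted(glob.glob(arg)) if "*" in arg else []
        expanded.extend(matches or [arg])
    return expanded


def run_pipeline(commands, dry_run=True):
    """Print (dry run) or execute each command in order; stop on the first failure."""
    resolved = []
    for argv in commands:
        # Expand late, so patterns see the files written by earlier steps.
        argv = expand_globs(argv)
        resolved.append(argv)
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)
    return resolved


# Dry run: only prints the commands, nothing is executed.
run_pipeline(build_pipeline(os.getcwd()), dry_run=True)
```

Calling `run_pipeline(..., dry_run=False)` would execute the steps for real; `check=True` makes a failing tool raise `subprocess.CalledProcessError` instead of letting later steps run on missing inputs.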
## Command Line Usage

All tools support `--help` for detailed options. Key commands:

### Generate RAS/RPS scores

```bash
# From gene expression
ras_generator -td $(pwd) -in expression.tsv -ra ras_output.tsv -rs ENGRO2

# From metabolite data
rps_generator -td $(pwd) -id metabolites.tsv -rp rps_output.tsv
```

### Flux sampling

```bash
flux_simulation -td $(pwd) -ms ENGRO2 -in bounds/*.tsv -a CBS -ns 1000 -idop results/
```

### Statistical analysis & visualization

```bash
marea -td $(pwd) -using_RAS true -input_data ras.tsv -choice_map ENGRO2 -gs true -idop maps/
```

## Galaxy Integration

COBRAxy provides Galaxy tool wrappers (`.xml` files) for web-based analysis:

- Upload data through Galaxy interface
- Chain tools in visual workflows
- Share and reproduce analyses
- Access via Galaxy ToolShed

## Tutorials

### Local Galaxy Installation

To set up a local Galaxy instance with COBRAxy tools:

1. **Install Galaxy**:

   ```bash
   # Clone Galaxy repository
   git clone -b release_23.1 https://github.com/galaxyproject/galaxy.git
   cd galaxy

   # Install dependencies and start Galaxy
   sh run.sh
   ```

2. **Install COBRAxy tools**:

   ```bash
   # Add COBRAxy tools to Galaxy
   mkdir -p tools/cobraxy
   cp path/to/COBRAxy/Galaxy_tools/*.xml tools/cobraxy/

   # Update tool_conf.xml to include COBRAxy tools
   # Add section in config/tool_conf.xml:
   # <section id="cobraxy" name="COBRAxy">
   #   <tool file="cobraxy/ras_generator.xml" />
   #   <tool file="cobraxy/rps_generator.xml" />
   #   <tool file="cobraxy/marea.xml" />
   #   <!-- Add other tools -->
   # </section>
   ```

3. **Galaxy Tutorial Resources**:
   - [Galaxy Installation Guide](https://docs.galaxyproject.org/en/master/admin/)
   - [Tool Development Tutorial](https://training.galaxyproject.org/training-material/topics/dev/)
   - [Galaxy Admin Training](https://training.galaxyproject.org/training-material/topics/admin/)

### Python Direct Usage

For programmatic use of COBRAxy tools in Python scripts:

1. **Installation for Development**:

   ```bash
   # Clone and install in development mode
   git clone https://github.com/CompBtBs/COBRAxy.git
   cd COBRAxy
   pip install -e .
   ```

2. **Python API Usage**:

   ```python
   import sys
   import os

   # Add COBRAxy to Python path
   sys.path.append('/path/to/COBRAxy')

   # Import tool modules
   import ras_generator
   import rps_generator
   import flux_simulation
   import marea
   import ras_to_bounds

   # Set working directory
   tool_dir = "/path/to/COBRAxy"
   os.chdir(tool_dir)

   # Generate RAS scores
   ras_args = [
       '-td', tool_dir,
       '-in', 'data/expression.tsv',
       '-ra', 'output/ras_values.tsv',
       '-rs', 'ENGRO2'
   ]
   ras_generator.main(ras_args)

   # Generate RPS scores (optional)
   rps_args = [
       '-td', tool_dir,
       '-id', 'data/metabolites.tsv',
       '-rp', 'output/rps_values.tsv'
   ]
   rps_generator.main(rps_args)

   # Create enriched pathway maps
   marea_args = [
       '-td', tool_dir,
       '-using_RAS', 'true',
       '-input_data', 'output/ras_values.tsv',
       '-choice_map', 'ENGRO2',
       '-gs', 'true',
       '-idop', 'maps'
   ]
   marea.main(marea_args)

   # Apply RAS constraints to model
   bounds_args = [
       '-td', tool_dir,
       '-ms', 'ENGRO2',
       '-ir', 'output/ras_values.tsv',
       '-rs', 'true',
       '-idop', 'bounds'
   ]
   ras_to_bounds.main(bounds_args)

   # Sample metabolic fluxes
   flux_args = [
       '-td', tool_dir,
       '-ms', 'ENGRO2',
       '-in', 'bounds/bounds_output.tsv',
       '-a', 'CBS',
       '-ns', '1000',
       '-idop', 'flux_results'
   ]
   flux_simulation.main(flux_args)
   ```

3. **Python Tutorial Resources**:
   - [COBRApy Documentation](https://cobrapy.readthedocs.io/)
   - [Metabolic Modeling with Python](https://opencobra.github.io/cobrapy/building_model.html)
   - [Flux Sampling Tutorial](https://cobrapy.readthedocs.io/en/stable/sampling.html)
   - [Jupyter Notebooks Examples](examples/) (included in repository)

## Input/Output Formats

| Data Type | Format | Description |
|-----------|---------|-------------|
| Gene expression | TSV | Genes (rows) × Samples (columns) |
| Metabolites | TSV | Metabolites (rows) × Samples (columns) |
| Models | SBML | Standard metabolic model format |
| Results | TSV/CSV | Tabular flux/score data |
| Maps | SVG/PDF | Styled pathway visualizations |

## Troubleshooting

**Common issues:**

- **Missing GLPK**: Install `glpk-utils` and `swiglpk` for optimal CBS performance
- **SVG errors**: Install `libvips` system library
- **Memory issues**: Reduce sampling count (`-ns`) or use fewer batches (`-nb`)

## Contributing

Contributions welcome! Please:

- Follow existing code style
- Add documentation for new features
- Test with provided example data
- Submit focused pull requests

## Citation

If you use COBRAxy in research, please cite:

- [COBRApy](https://opencobra.github.io/cobrapy/) for core metabolic modeling
- [MaREA](https://galaxyproject.org/use/marea4galaxy/) for enrichment methods
- This repository for integrated workflow

## Links

- [COBRApy Documentation](https://opencobra.github.io/cobrapy/)
- [Galaxy Project](https://usegalaxy.org/)
- [GSoC 2024 Project](https://summerofcode.withgoogle.com/programs/2024/projects/LSrCKfq7)
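
As a rough illustration of the genes × samples TSV layout listed under Input/Output Formats, the sketch below builds a tiny expression table in memory and checks that every gene row carries one value per sample. The gene IDs, sample names, and the `check_expression_tsv` helper are invented for this example and are not part of COBRAxy. The assumptions that the first column holds the gene identifier and that every value is numeric are also made here for illustration; the tools' own parsing may be more lenient.

```python
import csv
import io


def check_expression_tsv(handle):
    """Return (n_genes, n_samples) for a genes-by-samples TSV, or raise ValueError.

    Assumes a header row whose first field names the ID column and whose
    remaining fields are sample names (an assumption of this sketch).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or len(header) < 2:
        raise ValueError("expected an ID column plus at least one sample column")

    n_genes = 0
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # tolerate blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        for value in row[1:]:
            float(value)  # strict: raises ValueError on non-numeric entries
        n_genes += 1
    return n_genes, len(header) - 1


# Two made-up genes measured in two made-up samples.
example = (
    "gene_id\tsample_A\tsample_B\n"
    "GENE1\t5.2\t3.1\n"
    "GENE2\t0.0\t7.8\n"
)
print(check_expression_tsv(io.StringIO(example)))
```

The same check applies to a metabolite table by treating the first column as the metabolite name.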
