<div align="center">
<img src="docs/_media/logo.png" alt="COBRAxy Logo" width="200"/>
</div>

# COBRAxy

A Python toolkit for metabolic flux analysis and visualization, with Galaxy integration.

COBRAxy transforms gene expression and metabolite data into meaningful metabolic insights through flux sampling and interactive pathway maps.

Documentation: https://compbtbs.github.io/COBRAxy

## Features

- **Reaction Activity Scores (RAS)** from gene expression data
- **Reaction Propensity Scores (RPS)** from metabolite abundance
- **Flux sampling** with CBS or OptGP algorithms
- **Statistical analysis** with pFBA, FVA, and sensitivity analysis
- **Interactive maps** with SVG/PDF export and custom styling
- **Galaxy tools** for web-based analysis
- **Built-in models** including ENGRO2 and Recon

## Quick Start

### Installation

```bash
git clone https://github.com/CompBtBs/COBRAxy.git
cd COBRAxy
pip install .
```

### Basic Workflow

```bash
# 1. Generate RAS from expression data
ras_generator -td $(pwd) -in expression.tsv -ra ras_output.tsv -rs ENGRO2

# 2. Generate RPS from metabolite data (optional)
rps_generator -td $(pwd) -id metabolites.tsv -rp rps_output.tsv

# 3. Create enriched pathway maps with statistical analysis
marea -td $(pwd) -using_RAS true -input_data ras_output.tsv -choice_map ENGRO2 -gs true -idop base_maps

# 4. Apply RAS constraints to model for flux simulation
ras_to_bounds -td $(pwd) -ms ENGRO2 -ir ras_output.tsv -rs true -idop bounds_output

# 5. Sample metabolic fluxes with constrained model
flux_simulation -td $(pwd) -ms ENGRO2 -in bounds_output/*.tsv -a CBS -ns 1000 -idop flux_results

# 6. Add flux data to enriched maps
flux_to_map -td $(pwd) -if flux_results/*.tsv -mp base_maps/*.svg -idop final_maps
```

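The six steps above can also be driven from a Python script, which may help where no shell is available to expand wildcards such as `bounds_output/*.tsv`. The following is a minimal sketch and not part of COBRAxy: `expand`, `workflow_steps`, and `run_workflow` are hypothetical helper names, the flags are copied from the commands above, and it assumes the tools are on `PATH` after installation.

```python
import subprocess
from pathlib import Path


def expand(workdir, pattern):
    """Expand a wildcard relative to workdir, roughly as the shell would.

    If nothing matches, the pattern is passed through unchanged.
    """
    root = Path(workdir)
    matches = sorted(str(p.relative_to(root)) for p in root.glob(pattern))
    return matches or [pattern]


def workflow_steps(workdir):
    """Return one callable per step (hypothetical helper).

    Each argument list is built lazily, so wildcards are expanded only
    after the earlier steps have written their output files.
    """
    td = str(Path(workdir).resolve())
    return [
        lambda: ["ras_generator", "-td", td, "-in", "expression.tsv",
                 "-ra", "ras_output.tsv", "-rs", "ENGRO2"],
        lambda: ["rps_generator", "-td", td, "-id", "metabolites.tsv",
                 "-rp", "rps_output.tsv"],
        lambda: ["marea", "-td", td, "-using_RAS", "true",
                 "-input_data", "ras_output.tsv", "-choice_map", "ENGRO2",
                 "-gs", "true", "-idop", "base_maps"],
        lambda: ["ras_to_bounds", "-td", td, "-ms", "ENGRO2",
                 "-ir", "ras_output.tsv", "-rs", "true",
                 "-idop", "bounds_output"],
        lambda: ["flux_simulation", "-td", td, "-ms", "ENGRO2", "-in",
                 *expand(workdir, "bounds_output/*.tsv"),
                 "-a", "CBS", "-ns", "1000", "-idop", "flux_results"],
        lambda: ["flux_to_map", "-td", td, "-if",
                 *expand(workdir, "flux_results/*.tsv"), "-mp",
                 *expand(workdir, "base_maps/*.svg"), "-idop", "final_maps"],
    ]


def run_workflow(workdir, runner=subprocess.run):
    """Run the steps in order; check=True stops at the first failing tool."""
    for build in workflow_steps(workdir):
        runner(build(), cwd=str(workdir), check=True)
```

Calling `run_workflow(".")` from the directory that holds `expression.tsv` and `metabolites.tsv` would mirror the shell session. The `runner` parameter exists only so the command lists can be inspected without executing anything.
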
## Tools

| Tool | Purpose | Input | Output |
|------|---------|-------|--------|
| `metabolic_model_setting` | Extract model components | SBML model | Rules, reactions, bounds, medium |
| `ras_generator` | Compute reaction activity scores | Gene expression data | RAS values |
| `rps_generator` | Compute reaction propensity scores | Metabolite abundance | RPS values |
| `marea` | Statistical pathway analysis | RAS + RPS data | Enrichment + base maps |
| `ras_to_bounds` | Apply RAS constraints to model | RAS + SBML model | Constrained bounds |
| `flux_simulation` | Sample metabolic fluxes | Constrained model | Flux distributions |
| `flux_to_map` | Add fluxes to enriched maps | Flux samples + base maps | Final styled maps |
| `marea_cluster` | Cluster analysis | Expression/flux data | Sample clusters |

## Requirements

- **Python**: 3.8-3.11
- **OS**: Linux, macOS, Windows (Linux recommended)
- **Dependencies**: Automatically installed via pip (COBRApy, pandas, numpy, etc.)

**Optional system libraries** (for enhanced features):
```bash
# Ubuntu/Debian
sudo apt-get install libvips libglpk40 glpk-utils

# For Python GLPK bindings
pip install swiglpk
```

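Because the supported Python range is bounded on both sides, it can be worth checking the interpreter before installing. This is a minimal sketch using only the standard library. The helper name `python_supported` is illustrative, and the bounds simply restate the 3.8-3.11 range listed above.

```python
import sys

SUPPORTED_MIN = (3, 8)   # lowest version listed under Requirements
SUPPORTED_MAX = (3, 11)  # highest version listed under Requirements


def python_supported(version=None):
    """Return True if (major, minor) falls inside the supported range.

    With no argument, the running interpreter is checked.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX
```

If `python_supported()` returns `False`, creating a virtual environment with an interpreter inside the range is probably the simplest way forward.
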
## Data Flow

```
Gene Expression      Metabolite Data       SBML Model
      ↓                     ↓                   ↓
RAS Generator         RPS Generator       Model Tables
      ↓                     ↓
 RAS Values            RPS Values
   |    ↓                   ↓
   |    └─────────┬─────────┘
   |              ↓
   |            MAREA
   |        (Enrichment +
   |         Base Maps)
   ↓
RAS Values → RAS to Bounds ←── Model Tables
                   ↓
           Constrained Model
                   ↓
            Flux Simulation
                   ↓
             Flux Samples
                   ↓
              Flux to Map ←── Maps (ENGRO2)
                   ↓
          Final Enriched Maps
```

## Built-in Models & Data

COBRAxy includes ready-to-use resources:

- **Models**: ENGRO2, Recon (human metabolism)
- **Gene mappings**: HGNC, Ensembl, Entrez ID conversions
- **Pathway maps**: Pre-styled SVG templates
- **Medium compositions**: Standard growth conditions

These resources are located in the `local/` directory for immediate use.

## Command Line Usage

All tools support `--help` for detailed options. Key commands:

### Generate RAS/RPS scores
```bash
# From gene expression
ras_generator -td $(pwd) -in expression.tsv -ra ras_output.tsv -rs ENGRO2

# From metabolite data
rps_generator -td $(pwd) -id metabolites.tsv -rp rps_output.tsv
```

### Flux sampling
```bash
flux_simulation -td $(pwd) -ms ENGRO2 -in bounds/*.tsv -a CBS -ns 1000 -idop results/
```

### Statistical analysis & visualization
```bash
marea -td $(pwd) -using_RAS true -input_data ras.tsv -choice_map ENGRO2 -gs true -idop maps/
```

## Galaxy Integration

COBRAxy provides Galaxy tool wrappers (`.xml` files) for web-based analysis:

- Upload data through Galaxy interface
- Chain tools in visual workflows
- Share and reproduce analyses
- Access via Galaxy ToolShed

## Tutorials

### Local Galaxy Installation

To set up a local Galaxy instance with COBRAxy tools:

1. **Install Galaxy**:
   ```bash
   # Clone Galaxy repository
   git clone -b release_23.1 https://github.com/galaxyproject/galaxy.git
   cd galaxy

   # Install dependencies and start Galaxy
   sh run.sh
   ```

2. **Install COBRAxy tools**:
   ```bash
   # Add COBRAxy tools to Galaxy
   mkdir -p tools/cobraxy
   cp path/to/COBRAxy/Galaxy_tools/*.xml tools/cobraxy/

   # Update tool_conf.xml to include COBRAxy tools
   # Add section in config/tool_conf.xml:
   # <section id="cobraxy" name="COBRAxy">
   #   <tool file="cobraxy/ras_generator.xml" />
   #   <tool file="cobraxy/rps_generator.xml" />
   #   <tool file="cobraxy/marea.xml" />
   #   <!-- Add other tools -->
   # </section>
   ```

3. **Galaxy Tutorial Resources**:
   - [Galaxy Installation Guide](https://docs.galaxyproject.org/en/master/admin/)
   - [Tool Development Tutorial](https://training.galaxyproject.org/training-material/topics/dev/)
   - [Galaxy Admin Training](https://training.galaxyproject.org/training-material/topics/admin/)

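The `tool_conf.xml` edit described in step 2 can also be scripted rather than done by hand. The sketch below uses the standard library's `xml.etree.ElementTree`. The function name `add_cobraxy_section` is hypothetical, and the code assumes that the root element of the configuration file is `<toolbox>` and that the wrapper files were already copied to `tools/cobraxy/`.

```python
import xml.etree.ElementTree as ET


def add_cobraxy_section(conf_path, tool_files):
    """Add a COBRAxy <section> to a Galaxy tool_conf.xml (hypothetical helper).

    Returns False without touching the file if the section already exists.
    """
    tree = ET.parse(conf_path)
    root = tree.getroot()  # assumed to be <toolbox>
    if root.find("./section[@id='cobraxy']") is not None:
        return False
    section = ET.SubElement(root, "section", {"id": "cobraxy", "name": "COBRAxy"})
    for name in tool_files:
        # Paths are relative to Galaxy's tools/ directory, as in step 2.
        ET.SubElement(section, "tool", {"file": "cobraxy/" + name})
    tree.write(conf_path, encoding="utf-8", xml_declaration=True)
    return True
```

For example, `add_cobraxy_section("config/tool_conf.xml", ["ras_generator.xml", "rps_generator.xml", "marea.xml"])` would add the three wrappers named above. The default parser does not preserve XML comments, so keeping a backup of the original file is advisable.
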
### Python Direct Usage

For programmatic use of COBRAxy tools in Python scripts:

1. **Installation for Development**:
   ```bash
   # Clone and install in development mode
   git clone https://github.com/CompBtBs/COBRAxy.git
   cd COBRAxy
   pip install -e .
   ```

2. **Python API Usage**:
   ```python
   import sys
   import os

   # Add COBRAxy to Python path
   sys.path.append('/path/to/COBRAxy')

   # Import tool modules
   import ras_generator
   import rps_generator
   import flux_simulation
   import marea
   import ras_to_bounds

   # Set working directory
   tool_dir = "/path/to/COBRAxy"
   os.chdir(tool_dir)

   # Generate RAS scores
   ras_args = [
       '-td', tool_dir,
       '-in', 'data/expression.tsv',
       '-ra', 'output/ras_values.tsv',
       '-rs', 'ENGRO2'
   ]
   ras_generator.main(ras_args)

   # Generate RPS scores (optional)
   rps_args = [
       '-td', tool_dir,
       '-id', 'data/metabolites.tsv',
       '-rp', 'output/rps_values.tsv'
   ]
   rps_generator.main(rps_args)

   # Create enriched pathway maps
   marea_args = [
       '-td', tool_dir,
       '-using_RAS', 'true',
       '-input_data', 'output/ras_values.tsv',
       '-choice_map', 'ENGRO2',
       '-gs', 'true',
       '-idop', 'maps'
   ]
   marea.main(marea_args)

   # Apply RAS constraints to model
   bounds_args = [
       '-td', tool_dir,
       '-ms', 'ENGRO2',
       '-ir', 'output/ras_values.tsv',
       '-rs', 'true',
       '-idop', 'bounds'
   ]
   ras_to_bounds.main(bounds_args)

   # Sample metabolic fluxes
   flux_args = [
       '-td', tool_dir,
       '-ms', 'ENGRO2',
       '-in', 'bounds/bounds_output.tsv',
       '-a', 'CBS',
       '-ns', '1000',
       '-idop', 'flux_results'
   ]
   flux_simulation.main(flux_args)
   ```

3. **Python Tutorial Resources**:
   - [COBRApy Documentation](https://cobrapy.readthedocs.io/)
   - [Metabolic Modeling with Python](https://opencobra.github.io/cobrapy/building_model.html)
   - [Flux Sampling Tutorial](https://cobrapy.readthedocs.io/en/stable/sampling.html)
   - [Jupyter Notebooks Examples](examples/) (included in repository)

## Input/Output Formats

| Data Type | Format | Description |
|-----------|--------|-------------|
| Gene expression | TSV | Genes (rows) × Samples (columns) |
| Metabolites | TSV | Metabolites (rows) × Samples (columns) |
| Models | SBML | Standard metabolic model format |
| Results | TSV/CSV | Tabular flux/score data |
| Maps | SVG/PDF | Styled pathway visualizations |

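As a quick sanity check before running `ras_generator` or `rps_generator`, the rows-by-samples layout above can be verified with the standard library. This is a sketch under assumptions: the function name `check_matrix_tsv` is illustrative, and it assumes the first row is a header of sample names and the first column holds gene or metabolite identifiers, which the table implies but does not spell out.

```python
import csv
import io


def check_matrix_tsv(handle):
    """Return (n_rows, n_samples) for a tab-separated matrix, or raise ValueError."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or len(header) < 2:
        raise ValueError("expected an identifier column plus at least one sample column")
    n_rows = 0
    for line_no, row in enumerate(reader, start=2):
        if not row:  # skip blank lines
            continue
        if len(row) != len(header):
            raise ValueError(f"line {line_no}: {len(row)} fields, expected {len(header)}")
        n_rows += 1
    return n_rows, len(header) - 1


# Made-up two-gene, two-sample table, for illustration only.
example = io.StringIO("Gene\tsample1\tsample2\ngeneA\t1.2\t0.4\ngeneB\t3.1\t2.2\n")
print(check_matrix_tsv(example))  # (2, 2)
```

The check only covers shape. It does not validate identifiers against the built-in gene mappings or confirm that values are numeric.
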
## Troubleshooting

**Common issues:**

- **Missing GLPK**: Install `glpk-utils` and `swiglpk` for optimal CBS performance
- **SVG errors**: Install `libvips` system library
- **Memory issues**: Reduce sampling count (`-ns`) or use fewer batches (`-nb`)

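To see which of the optional GLPK pieces are visible from the current environment, a small diagnostic can help. The sketch below is not a COBRAxy command, and `optional_dependency_report` is a hypothetical helper. It looks for the `swiglpk` Python module and for the `glpsol` executable that the `glpk-utils` package provides. It does not check `libvips`.

```python
import importlib.util
import shutil


def optional_dependency_report():
    """Report whether the optional GLPK pieces can be found (hypothetical helper)."""
    return {
        # Python bindings installed with `pip install swiglpk`
        "swiglpk": importlib.util.find_spec("swiglpk") is not None,
        # Command-line solver shipped in the glpk-utils package
        "glpsol": shutil.which("glpsol") is not None,
    }
```

A `False` entry suggests running the corresponding install command from the Requirements section.
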
## Contributing

Contributions welcome! Please:
- Follow existing code style
- Add documentation for new features
- Test with provided example data
- Submit focused pull requests

## Citation

If you use COBRAxy in research, please cite:
- [COBRApy](https://opencobra.github.io/cobrapy/) for core metabolic modeling
- [MaREA](https://galaxyproject.org/use/marea4galaxy/) for enrichment methods
- This repository for integrated workflow

## Links

- [COBRApy Documentation](https://opencobra.github.io/cobrapy/)
- [Galaxy Project](https://usegalaxy.org/)
- [GSoC 2024 Project](https://summerofcode.withgoogle.com/programs/2024/projects/LSrCKfq7)