comparison COBRAxy/docs/tools/rps-generator.md @ 547:73f2f7e2be17 draft

Uploaded
author francesco_lapi
date Tue, 28 Oct 2025 10:44:07 +0000
parents 4ed95023af20
children
546:01147e83f43c 547:73f2f7e2be17
1 # RPS Generator 1 # RPS Generator
2 2
3 Generate Reaction Propensity Scores (RPS) from metabolite abundance data. 3 Compute Reaction Presence Scores (RPS) from metabolite abundance data.
4 4
5 ## Overview 5 ## Overview
6 6
7 The RPS Generator computes reaction propensity scores based on metabolite abundance measurements. RPS values indicate how likely metabolic reactions are to be active based on the availability of their substrate and product metabolites. 7 RPS Generator calculates reaction presence scores based on metabolite availability in reaction formulas.
8
9 ## Galaxy Interface
10
11 In Galaxy: **COBRAxy → RPS Generator**
12
13 1. Select built-in model or upload custom reactions
14 2. Upload metabolite abundance data
15 3. Click **Execute**
8 16
9 ## Usage 17 ## Usage
10 18
11 ### Command Line
12
13 ```bash 19 ```bash
14 rps_generator -td /path/to/COBRAxy \ 20 rps_generator -rs ENGRO2 \
15 -id metabolite_abundance.tsv \ 21 -in metabolite_data.tsv \
16 -rp output_rps.tsv \ 22 -rps rps_scores.tsv \
17 -ol log.txt 23 -ol rps_generation.log
18 ``` 24 ```
19
20 ### Galaxy Interface
21
22 Select "RPS Generator" from the COBRAxy tool suite and upload your metabolite abundance file.
23 25
24 ## Parameters 26 ## Parameters
25 27
26 ### Required Parameters
27
28 | Parameter | Flag | Description |
29 |-----------|------|-------------|
30 | Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
31 | Input Dataset | `-id, --input` | Metabolite abundance TSV file (rows=metabolites, cols=samples) |
32 | RPS Output | `-rp, --rps_output` | Output file path for RPS scores |
33
34 ### Optional Parameters
35
36 | Parameter | Flag | Description | Default | 28 | Parameter | Flag | Description | Default |
37 |-----------|------|-------------|---------| 29 |-----------|------|-------------|---------|
38 | Custom Reactions | `-rl, --model_upload` | Path to custom reactions file | Built-in reactions | 30 | Rules Selector | `-rs` | ENGRO2, Recon, or Custom | ENGRO2 |
39 | Output Log | `-ol, --out_log` | Log file for warnings/errors | Standard output | 31 | Input Data | `-in` | Metabolite abundance TSV file | - |
32 | Output RPS | `-rps` | Output RPS scores file | - |
33 | Output Log | `-ol` | Log file | - |
34 | Custom Rules | `-rl` | Custom reaction formulas file | - |
40 35
41 ## Input Format 36 ## Input Format
42 37
43 ### Metabolite Abundance File 38 Metabolite data file (TSV):
44
45 Tab-separated values (TSV) format:
46 39
47 ``` 40 ```
48 Metabolite Sample1 Sample2 Sample3 41 Metabolite Sample1 Sample2 Sample3
49 glucose 100.5 85.2 92.7 42 glc_c 2.5 1.8 3.2
50 pyruvate 45.3 38.9 41.2 43 atp_c 5.2 4.9 5.8
51 lactate 15.8 22.1 18.5 44 pyr_c 1.5 2.1 1.8
52 ``` 45 ```
53 46
54 **Requirements:** 47 **File Format Notes:**
55 - First column: metabolite names (case-insensitive) 48 - Use **tab-separated** values (TSV)
56 - Subsequent columns: abundance values for each sample 49 - First row must contain column headers (Metabolite, Sample names)
57 - Missing values: use 0 or leave empty 50 - Metabolite names must include compartment suffix (e.g., _c, _m, _e)
58 - File encoding: UTF-8 51 - Numeric values only for abundance data
59
60 ### Custom Reactions File (Optional)
61
62 If using custom reactions instead of built-in ones:
63
64 ```
65 ReactionID Reaction
66 R00001 glucose + ATP -> glucose-6-phosphate + ADP
67 R00002 glucose-6-phosphate <-> fructose-6-phosphate
68 ```
69 52
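The format notes in the new revision (tab-separated, header row, compartment suffix, numeric values) can be checked before running the tool. The following is a minimal sketch, not part of COBRAxy: the function name `check_metabolite_tsv` and the suffix pattern (underscore followed by lowercase letters, as in `_c`, `_m`, `_e`) are assumptions made for illustration.

```python
import csv
import io
import re

# Assumption: a compartment suffix is an underscore plus lowercase letters (_c, _m, _e).
COMPARTMENT_SUFFIX = re.compile(r"_[a-z]+$")


def check_metabolite_tsv(text):
    """Return a list of human-readable problems found in a metabolite TSV."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]
    if len(header) < 2:
        problems.append("header needs a metabolite column and at least one sample column")

    for line_no, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # ignore blank lines
        name, values = row[0], row[1:]
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} fields, got {len(row)}")
        if not COMPARTMENT_SUFFIX.search(name):
            problems.append(f"line {line_no}: '{name}' has no compartment suffix")
        for value in values:
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: '{value}' is not numeric")
    return problems


# Hypothetical input with one bad metabolite name and one non-numeric value.
example = "Metabolite\tSample1\tSample2\nglc_c\t2.5\t1.8\npyruvate\t1.5\tn/a\n"
for problem in check_metabolite_tsv(example):
    print(problem)
```

An empty list means none of the four checks failed. It does not guarantee that the metabolite names match the selected rule set.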
70 ## Output Format 53 ## Output Format
71 54
72 ### RPS Scores File
73
74 ``` 55 ```
75 Reaction Sample1 Sample2 Sample3 56 Reaction Sample1 Sample2 Sample3
76 R00001 0.85 0.72 0.79 57 R00001 1.25 0.95 1.42
77 R00002 0.45 0.38 0.52 58 R00002 0.85 1.15 0.92
78 R00003 0.12 0.28 0.21
79 ``` 59 ```
80
81 - Values range from 0 (low propensity) to 1 (high propensity)
82 - NaN values indicate insufficient metabolite data for that reaction
83
84 ## Algorithm
85
86 1. **Metabolite Matching**: Input metabolite names are matched against internal synonyms
87 2. **Abundance Normalization**: Raw abundances are normalized per sample
88 3. **Reaction Scoring**: For each reaction, RPS is computed based on:
89 - Substrate availability (geometric mean of substrate abundances)
90 - Product formation potential
91 - Stoichiometric coefficients
92 60
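The removed Algorithm section describes substrate availability as a geometric mean of substrate abundances. The sketch below illustrates only that one idea and should not be read as COBRAxy's implementation: it ignores normalization, product terms, and stoichiometric coefficients, and the helper names `substrates` and `geometric_mean_score` are hypothetical. It assumes formulas written as in the old custom reactions example, with ` + ` between metabolites and ` -> ` or ` <-> ` between the two sides.

```python
import math
import re


def substrates(formula):
    """Return the left-hand-side metabolite names of a formula like 'a + b -> c'."""
    # The arrow must be surrounded by whitespace, so hyphens inside names
    # such as 'glucose-6-phosphate' are not mistaken for it.
    left, _right = re.split(r"\s+<?->\s+", formula, maxsplit=1)
    return [name.strip() for name in left.split(" + ")]


def geometric_mean_score(formula, abundances):
    """Geometric mean of substrate abundances for one sample.

    `abundances` maps lowercase metabolite names to values. Returns NaN when
    a substrate is missing and 0.0 when any substrate abundance is not positive.
    """
    values = [abundances.get(name.lower()) for name in substrates(formula)]
    if any(v is None for v in values):
        return math.nan
    if any(v <= 0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical single-sample abundances, keyed by lowercase name.
sample = {"glucose": 100.5, "atp": 50.0}
print(geometric_mean_score("glucose + ATP -> glucose-6-phosphate + ADP", sample))
print(geometric_mean_score("glucose-6-phosphate <-> fructose-6-phosphate", sample))
```

The second call prints `nan` because the sample has no value for `glucose-6-phosphate`, which mirrors the old documentation's note that NaN indicates insufficient metabolite data.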
93 ## Examples 61 ## Examples
94 62
95 ### Basic Usage 63 ### Basic Usage
96 64
97 ```bash 65 ```bash
98 # Generate RPS from metabolite data 66 rps_generator -rs ENGRO2 \
99 rps_generator -td /opt/COBRAxy \ 67 -in metabolites.tsv \
100 -id /data/metabolomics.tsv \ 68 -rps rps_scores.tsv
101 -rp /results/rps_scores.tsv
102 ``` 69 ```
103 70
104 ### With Custom Reactions 71 ### Custom Reactions
105 72
106 ```bash 73 ```bash
107 # Use custom reaction set 74 rps_generator -rs Custom \
108 rps_generator -td /opt/COBRAxy \ 75 -rl custom_reactions.csv \
109 -id /data/metabolomics.tsv \ 76 -in metabolites.tsv \
110 -rl /custom/reactions.tsv \ 77 -rps rps_scores.tsv
111 -rp /results/custom_rps.tsv \
112 -ol /logs/rps.log
113 ``` 78 ```
114
115 ## Tips and Best Practices
116
117 ### Data Preparation
118
119 - **Metabolite Names**: Use standard nomenclature (KEGG, ChEBI, or common names)
120 - **Missing Data**: Remove samples with >50% missing metabolites
121 - **Outliers**: Consider log-transformation for highly variable metabolites
122 - **Replicates**: Average technical replicates before analysis
123
124 ### Quality Control
125
126 - Check log file for unmatched metabolite names
127 - Verify RPS score distributions (should span 0-1 range)
128 - Compare results with expected pathway activities
129
130 ### Integration with Other Tools
131
132 RPS scores are typically used with:
133 - [MAREA](marea.md) for pathway enrichment analysis
134 - [Flux to Map](flux-to-map.md) for metabolic map visualization
135 79
136 ## Troubleshooting 80 ## Troubleshooting
137 81
138 ### Common Issues 82 | Error | Solution |
139 83 |-------|----------|
140 **No RPS scores generated** 84 | "Metabolite not found" | Check metabolite nomenclature |
141 - Check metabolite name format and spelling 85 | "Invalid formula" | Verify reaction formula syntax |
142 - Verify input file has correct TSV format
143 - Ensure tool directory contains reaction databases
144
145 **Many NaN values in output**
146 - Insufficient metabolite coverage for reactions
147 - Consider using a smaller, more focused reaction set
148
149 **Memory errors**
150 - Reduce dataset size or split into batches
151 - Increase available system memory
152
153 ### Error Messages
154
155 | Error | Cause | Solution |
156 |-------|--------|----------|
157 | "File not found" | Missing input file | Check file path and permissions |
158 | "Invalid format" | Malformed TSV | Verify column headers and data types |
159 | "No metabolites matched" | Name mismatch | Check metabolite nomenclature |
160 86
161 ## See Also 87 ## See Also
162 88
163 - [RAS Generator](ras-generator.md) - Generate reaction activity scores from gene expression 89 - [MAREA](tools/marea)
164 - [MAREA](marea.md) - Statistical analysis and visualization 90 - [RAS Generator](tools/ras-generator)
165 - [Flux Simulation](flux-simulation.md) - Constraint-based modeling 91 - [Built-in Models](reference/built-in-models)