comparison COBRAxy/docs/tools/rps-generator.md @ 547:73f2f7e2be17 draft
Uploaded
| author | francesco_lapi |
|---|---|
| date | Tue, 28 Oct 2025 10:44:07 +0000 |
| parents | 4ed95023af20 |
| children | |
comparison
| 546:01147e83f43c | 547:73f2f7e2be17 |
|---|---|
| 1 # RPS Generator | 1 # RPS Generator |
| 2 | 2 |
| 3 Generate Reaction Propensity Scores (RPS) from metabolite abundance data. | 3 Compute Reaction Presence Scores (RPS) from metabolite abundance data. |
| 4 | 4 |
| 5 ## Overview | 5 ## Overview |
| 6 | 6 |
| 7 The RPS Generator computes reaction propensity scores based on metabolite abundance measurements. RPS values indicate how likely metabolic reactions are to be active based on the availability of their substrate and product metabolites. | 7 RPS Generator calculates reaction presence scores based on metabolite availability in reaction formulas. |
| 8 | |
| 9 ## Galaxy Interface | |
| 10 | |
| 11 In Galaxy: **COBRAxy → RPS Generator** | |
| 12 | |
| 13 1. Select built-in model or upload custom reactions | |
| 14 2. Upload metabolite abundance data | |
| 15 3. Click **Execute** | |
| 8 | 16 |
| 9 ## Usage | 17 ## Usage |
| 10 | 18 |
| 11 ### Command Line | |
| 12 | |
| 13 ```bash | 19 ```bash |
| 14 rps_generator -td /path/to/COBRAxy \ | 20 rps_generator -rs ENGRO2 \ |
| 15 -id metabolite_abundance.tsv \ | 21 -in metabolite_data.tsv \ |
| 16 -rp output_rps.tsv \ | 22 -rps rps_scores.tsv \ |
| 17 -ol log.txt | 23 -ol rps_generation.log |
| 18 ``` | 24 ``` |
| 19 | |
| 20 ### Galaxy Interface | |
| 21 | |
| 22 Select "RPS Generator" from the COBRAxy tool suite and upload your metabolite abundance file. | |
| 23 | 25 |
| 24 ## Parameters | 26 ## Parameters |
| 25 | 27 |
| 26 ### Required Parameters | |
| 27 | |
| 28 | Parameter | Flag | Description | | |
| 29 |-----------|------|-------------| | |
| 30 | Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory | | |
| 31 | Input Dataset | `-id, --input` | Metabolite abundance TSV file (rows=metabolites, cols=samples) | | |
| 32 | RPS Output | `-rp, --rps_output` | Output file path for RPS scores | | |
| 33 | |
| 34 ### Optional Parameters | |
| 35 | |
| 36 | Parameter | Flag | Description | Default | | 28 | Parameter | Flag | Description | Default | |
| 37 |-----------|------|-------------|---------| | 29 |-----------|------|-------------|---------| |
| 38 | Custom Reactions | `-rl, --model_upload` | Path to custom reactions file | Built-in reactions | | 30 | Rules Selector | `-rs` | ENGRO2, Recon, or Custom | ENGRO2 | |
| 39 | Output Log | `-ol, --out_log` | Log file for warnings/errors | Standard output | | 31 | Input Data | `-in` | Metabolite abundance TSV file | - | |
| 32 | Output RPS | `-rps` | Output RPS scores file | - | | |
| 33 | Output Log | `-ol` | Log file | - | | |
| 34 | Custom Rules | `-rl` | Custom reaction formulas file | - | | |
| 40 | 35 |
| 41 ## Input Format | 36 ## Input Format |
| 42 | 37 |
| 43 ### Metabolite Abundance File | 38 Metabolite data file (TSV): |
| 44 | |
| 45 Tab-separated values (TSV) format: | |
| 46 | 39 |
| 47 ``` | 40 ``` |
| 48 Metabolite Sample1 Sample2 Sample3 | 41 Metabolite Sample1 Sample2 Sample3 |
| 49 glucose 100.5 85.2 92.7 | 42 glc_c 2.5 1.8 3.2 |
| 50 pyruvate 45.3 38.9 41.2 | 43 atp_c 5.2 4.9 5.8 |
| 51 lactate 15.8 22.1 18.5 | 44 pyr_c 1.5 2.1 1.8 |
| 52 ``` | 45 ``` |
| 53 | 46 |
| 54 **Requirements:** | 47 **File Format Notes:** |
| 55 - First column: metabolite names (case-insensitive) | 48 - Use **tab-separated** values (TSV) |
| 56 - Subsequent columns: abundance values for each sample | 49 - First row must contain column headers (Metabolite, Sample names) |
| 57 - Missing values: use 0 or leave empty | 50 - Metabolite names must include compartment suffix (e.g., _c, _m, _e) |
| 58 - File encoding: UTF-8 | 51 - Numeric values only for abundance data |
| 59 | |
| 60 ### Custom Reactions File (Optional) | |
| 61 | |
| 62 If using custom reactions instead of built-in ones: | |
| 63 | |
| 64 ``` | |
| 65 ReactionID Reaction | |
| 66 R00001 glucose + ATP -> glucose-6-phosphate + ADP | |
| 67 R00002 glucose-6-phosphate <-> fructose-6-phosphate | |
| 68 ``` | |
| 69 | 52 |
| 70 ## Output Format | 53 ## Output Format |
| 71 | 54 |
| 72 ### RPS Scores File | |
| 73 | |
| 74 ``` | 55 ``` |
| 75 Reaction Sample1 Sample2 Sample3 | 56 Reaction Sample1 Sample2 Sample3 |
| 76 R00001 0.85 0.72 0.79 | 57 R00001 1.25 0.95 1.42 |
| 77 R00002 0.45 0.38 0.52 | 58 R00002 0.85 1.15 0.92 |
| 78 R00003 0.12 0.28 0.21 | |
| 79 ``` | 59 ``` |
| 80 | |
| 81 - Values range from 0 (low propensity) to 1 (high propensity) | |
| 82 - NaN values indicate insufficient metabolite data for that reaction | |
| 83 | |
| 84 ## Algorithm | |
| 85 | |
| 86 1. **Metabolite Matching**: Input metabolite names are matched against internal synonyms | |
| 87 2. **Abundance Normalization**: Raw abundances are normalized per sample | |
| 88 3. **Reaction Scoring**: For each reaction, RPS is computed based on: | |
| 89 - Substrate availability (geometric mean of substrate abundances) | |
| 90 - Product formation potential | |
| 91 - Stoichiometric coefficients | |
| 92 | 60 |
| 93 ## Examples | 61 ## Examples |
| 94 | 62 |
| 95 ### Basic Usage | 63 ### Basic Usage |
| 96 | 64 |
| 97 ```bash | 65 ```bash |
| 98 # Generate RPS from metabolite data | 66 rps_generator -rs ENGRO2 \ |
| 99 rps_generator -td /opt/COBRAxy \ | 67 -in metabolites.tsv \ |
| 100 -id /data/metabolomics.tsv \ | 68 -rps rps_scores.tsv |
| 101 -rp /results/rps_scores.tsv | |
| 102 ``` | 69 ``` |
| 103 | 70 |
| 104 ### With Custom Reactions | 71 ### Custom Reactions |
| 105 | 72 |
| 106 ```bash | 73 ```bash |
| 107 # Use custom reaction set | 74 rps_generator -rs Custom \ |
| 108 rps_generator -td /opt/COBRAxy \ | 75 -rl custom_reactions.csv \ |
| 109 -id /data/metabolomics.tsv \ | 76 -in metabolites.tsv \ |
| 110 -rl /custom/reactions.tsv \ | 77 -rps rps_scores.tsv |
| 111 -rp /results/custom_rps.tsv \ | |
| 112 -ol /logs/rps.log | |
| 113 ``` | 78 ``` |
| 114 | |
| 115 ## Tips and Best Practices | |
| 116 | |
| 117 ### Data Preparation | |
| 118 | |
| 119 - **Metabolite Names**: Use standard nomenclature (KEGG, ChEBI, or common names) | |
| 120 - **Missing Data**: Remove samples with >50% missing metabolites | |
| 121 - **Outliers**: Consider log-transformation for highly variable metabolites | |
| 122 - **Replicates**: Average technical replicates before analysis | |
| 123 | |
| 124 ### Quality Control | |
| 125 | |
| 126 - Check log file for unmatched metabolite names | |
| 127 - Verify RPS score distributions (should span 0-1 range) | |
| 128 - Compare results with expected pathway activities | |
| 129 | |
| 130 ### Integration with Other Tools | |
| 131 | |
| 132 RPS scores are typically used with: | |
| 133 - [MAREA](marea.md) for pathway enrichment analysis | |
| 134 - [Flux to Map](flux-to-map.md) for metabolic map visualization | |
| 135 | 79 |
| 136 ## Troubleshooting | 80 ## Troubleshooting |
| 137 | 81 |
| 138 ### Common Issues | 82 | Error | Solution | |
| 139 | 83 |-------|----------| |
| 140 **No RPS scores generated** | 84 | "Metabolite not found" | Check metabolite nomenclature | |
| 141 - Check metabolite name format and spelling | 85 | "Invalid formula" | Verify reaction formula syntax | |
| 142 - Verify input file has correct TSV format | |
| 143 - Ensure tool directory contains reaction databases | |
| 144 | |
| 145 **Many NaN values in output** | |
| 146 - Insufficient metabolite coverage for reactions | |
| 147 - Consider using a smaller, more focused reaction set | |
| 148 | |
| 149 **Memory errors** | |
| 150 - Reduce dataset size or split into batches | |
| 151 - Increase available system memory | |
| 152 | |
| 153 ### Error Messages | |
| 154 | |
| 155 | Error | Cause | Solution | | |
| 156 |-------|--------|----------| | |
| 157 | "File not found" | Missing input file | Check file path and permissions | | |
| 158 | "Invalid format" | Malformed TSV | Verify column headers and data types | | |
| 159 | "No metabolites matched" | Name mismatch | Check metabolite nomenclature | | |
| 160 | 86 |
| 161 ## See Also | 87 ## See Also |
| 162 | 88 |
| 163 - [RAS Generator](ras-generator.md) - Generate reaction activity scores from gene expression | 89 - [MAREA](tools/marea) |
| 164 - [MAREA](marea.md) - Statistical analysis and visualization | 90 - [RAS Generator](tools/ras-generator) |
| 165 - [Flux Simulation](flux-simulation.md) - Constraint-based modeling | 91 - [Built-in Models](reference/built-in-models) |
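
The File Format Notes added in revision 547 (tab-separated values, a header row, compartment suffixes on metabolite names, numeric abundance values) can be checked before running the tool. The following is a minimal sketch of such a pre-flight check, not part of COBRAxy: the function name `check_metabolite_tsv` and the suffix pattern are assumptions made for illustration, and the real tool may accept or reject inputs differently.

```python
import csv
import io
import re

# Assumption: a compartment suffix is an underscore followed by one or two
# lowercase letters (e.g. _c, _m, _e). The actual rule sets may use others.
COMPARTMENT_SUFFIX = re.compile(r"_[a-z]{1,2}$")


def check_metabolite_tsv(text):
    """Return a list of problems found in a metabolite abundance TSV string.

    Hypothetical helper: it mirrors the File Format Notes in the revised
    documentation, not the validation that rps_generator itself performs.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["file is empty"]

    header = rows[0]
    if len(header) < 2:
        # A single field usually means the file is not tab-separated.
        return ["header needs a metabolite column and at least one sample "
                "column (is the file tab-separated?)"]

    problems = []
    for line_no, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}")
            continue
        name, values = row[0], row[1:]
        if not COMPARTMENT_SUFFIX.search(name):
            problems.append(f"line {line_no}: '{name}' has no compartment suffix")
        for sample, value in zip(header[1:], values):
            try:
                float(value)
            except ValueError:
                problems.append(
                    f"line {line_no}: non-numeric value '{value}' for {sample}")
    return problems
```

Calling `check_metabolite_tsv(open("metabolite_data.tsv").read())` returns an empty list when none of these checks fail. A name such as `glucose` from the revision 546 example would be reported as lacking a compartment suffix.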
