annotate COBRAxy/docs/tools/rps-generator.md @ 509:5956dcf94277 draft default tip

Uploaded
author francesco_lapi
date Wed, 01 Oct 2025 15:34:21 +0000
parents 4ed95023af20
children

# RPS Generator

Generate Reaction Propensity Scores (RPS) from metabolite abundance data.

## Overview

The RPS Generator computes reaction propensity scores based on metabolite abundance measurements. RPS values indicate how likely metabolic reactions are to be active based on the availability of their substrate and product metabolites.

## Usage

### Command Line

```bash
rps_generator -td /path/to/COBRAxy \
              -id metabolite_abundance.tsv \
              -rp output_rps.tsv \
              -ol log.txt
```
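
When the tool is called from a script rather than typed into a shell, the same invocation can be assembled as an argument list. The sketch below is only an illustration: `build_rps_command` is a hypothetical helper, it assumes `rps_generator` is available on `PATH`, and it uses only the flags documented on this page.

```python
import shlex
import subprocess


def build_rps_command(tool_dir, input_tsv, rps_output, out_log=None, reactions=None):
    """Assemble the rps_generator argument list (hypothetical helper)."""
    cmd = ["rps_generator", "-td", tool_dir, "-id", input_tsv, "-rp", rps_output]
    if reactions is not None:
        cmd += ["-rl", reactions]  # optional custom reactions file
    if out_log is not None:
        cmd += ["-ol", out_log]    # optional log file
    return cmd


cmd = build_rps_command("/path/to/COBRAxy", "metabolite_abundance.tsv",
                        "output_rps.tsv", out_log="log.txt")
print(shlex.join(cmd))

# To actually run it (requires rps_generator to be installed and on PATH):
# subprocess.run(cmd, check=True)
```

Passing a list instead of a single string avoids shell quoting problems when paths contain spaces.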

### Galaxy Interface

Select "RPS Generator" from the COBRAxy tool suite and upload your metabolite abundance file.

## Parameters

### Required Parameters

| Parameter | Flag | Description |
|-----------|------|-------------|
| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
| Input Dataset | `-id, --input` | Metabolite abundance TSV file (rows=metabolites, cols=samples) |
| RPS Output | `-rp, --rps_output` | Output file path for RPS scores |

### Optional Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Custom Reactions | `-rl, --model_upload` | Path to custom reactions file | Built-in reactions |
| Output Log | `-ol, --out_log` | Log file for warnings/errors | Standard output |

## Input Format

### Metabolite Abundance File

Tab-separated values (TSV) format:

```
Metabolite    Sample1    Sample2    Sample3
glucose       100.5      85.2       92.7
pyruvate      45.3       38.9       41.2
lactate       15.8       22.1       18.5
```

**Requirements:**
- First column: metabolite names (case-insensitive)
- Subsequent columns: abundance values for each sample
- Missing values: use 0 or leave empty
- File encoding: UTF-8
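
The requirements above can be checked before running the tool. The following is a minimal sketch, not part of COBRAxy: `check_abundance_table` is a hypothetical helper, and the duplicate-name check is an extra sanity check rather than a documented requirement.

```python
import csv
import io


def check_abundance_table(text):
    """Return a list of problems found in a metabolite abundance TSV (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if len(header) < 2:
        problems.append("header needs a metabolite column and at least one sample column")
    seen = set()
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} fields, found {len(row)}")
            continue
        name = row[0].strip().lower()  # names are treated case-insensitively
        if name in seen:
            problems.append(f"line {line_no}: duplicate metabolite '{row[0]}'")
        seen.add(name)
        for sample, value in zip(header[1:], row[1:]):
            if value.strip() == "":
                continue  # empty cells are allowed for missing values
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: non-numeric value '{value}' in {sample}")
    return problems


example = (
    "Metabolite\tSample1\tSample2\tSample3\n"
    "glucose\t100.5\t85.2\t92.7\n"
    "pyruvate\t45.3\t\t41.2\n"
    "lactate\t15.8\tn/a\t18.5\n"
)
print(check_abundance_table(example))
# → ["line 4: non-numeric value 'n/a' in Sample2"]
```

For a file on disk, read it with `open(path, encoding="utf-8").read()` and pass the text to the helper.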

### Custom Reactions File (Optional)

If using custom reactions instead of built-in ones:

```
ReactionID    Reaction
R00001        glucose + ATP -> glucose-6-phosphate + ADP
R00002        glucose-6-phosphate <-> fructose-6-phosphate
```
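
To illustrate how a reaction string in this layout breaks down into substrates and products, here is a small parsing sketch. It is an assumption-laden illustration, not the tool's parser: `parse_reaction` is a hypothetical helper, the two columns are assumed to be tab-separated, and stoichiometric coefficients (for example `2 ATP`) are not handled.

```python
import csv
import io


def parse_reaction(formula):
    """Split 'A + B -> C + D' into (substrates, products, reversible)."""
    if "<->" in formula:          # check the reversible arrow first, it contains '->'
        left, right = formula.split("<->", 1)
        reversible = True
    elif "->" in formula:
        left, right = formula.split("->", 1)
        reversible = False
    else:
        raise ValueError(f"no reaction arrow in: {formula!r}")

    def side(text):
        # Split on ' + ' (with spaces) so names such as
        # 'glucose-6-phosphate' stay intact.
        return [name.strip() for name in text.split(" + ") if name.strip()]

    return side(left), side(right), reversible


custom = (
    "ReactionID\tReaction\n"
    "R00001\tglucose + ATP -> glucose-6-phosphate + ADP\n"
    "R00002\tglucose-6-phosphate <-> fructose-6-phosphate\n"
)
reactions = {
    row["ReactionID"]: parse_reaction(row["Reaction"])
    for row in csv.DictReader(io.StringIO(custom), delimiter="\t")
}
print(reactions["R00001"])
# → (['glucose', 'ATP'], ['glucose-6-phosphate', 'ADP'], False)
```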

## Output Format

### RPS Scores File

```
Reaction    Sample1    Sample2    Sample3
R00001      0.85       0.72       0.79
R00002      0.45       0.38       0.52
R00003      0.12       0.28       0.21
```

- Values range from 0 (low propensity) to 1 (high propensity)
- NaN values indicate insufficient metabolite data for that reaction
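
A quick way to see how much of the output is usable is to count NaN entries per reaction. The sketch below assumes the output is tab-separated and that missing scores are written as the text `NaN`; `nan_fraction_per_reaction` is a hypothetical helper and the sample values are made up for illustration.

```python
import csv
import io
import math


def nan_fraction_per_reaction(text):
    """Map each reaction ID to the fraction of samples whose score is NaN."""
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    next(rows)  # skip the header row
    result = {}
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or ID-only lines
        values = [float(v) for v in row[1:]]  # float('NaN') parses to nan
        result[row[0]] = sum(math.isnan(v) for v in values) / len(values)
    return result


# Illustrative data only, not real tool output.
scores = (
    "Reaction\tSample1\tSample2\tSample3\n"
    "R00001\t0.85\t0.72\t0.79\n"
    "R00002\t0.45\tNaN\t0.52\n"
    "R00003\tNaN\tNaN\tNaN\n"
)
fractions = nan_fraction_per_reaction(scores)
for reaction, fraction in fractions.items():
    print(f"{reaction}: {fraction:.2f}")
```

Reactions with a fraction close to 1 are candidates for the coverage checks described under Troubleshooting.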

## Algorithm

1. **Metabolite Matching**: Input metabolite names are matched against internal synonyms
2. **Abundance Normalization**: Raw abundances are normalized per sample
3. **Reaction Scoring**: For each reaction, RPS is computed based on:
   - Substrate availability (geometric mean of substrate abundances)
   - Product formation potential
   - Stoichiometric coefficients
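
The three steps can be sketched for a single sample as follows. This is a simplified illustration under stated assumptions, not the tool's implementation: the synonym table is hypothetical, normalization by the per-sample maximum is only one possible choice, and the score covers substrate availability alone (product formation potential and stoichiometric coefficients are left out).

```python
import math

# Hypothetical synonym table; the tool's internal dictionary is not shown here.
SYNONYMS = {"glucose": "glucose", "d-glucose": "glucose", "atp": "ATP"}


def match_names(raw):
    """Step 1: map input names to canonical names, ignoring case."""
    return {SYNONYMS[n.lower()]: v for n, v in raw.items() if n.lower() in SYNONYMS}


def normalize(sample):
    """Step 2: scale abundances by the sample maximum (assumed scheme, needs max > 0)."""
    top = max(sample.values())
    return {name: value / top for name, value in sample.items()}


def substrate_score(substrates, sample):
    """Step 3 (partial): geometric mean of substrate abundances.

    Returns NaN when any substrate was not measured.
    """
    if any(name not in sample for name in substrates):
        return math.nan
    values = [sample[name] for name in substrates]
    return math.prod(values) ** (1 / len(values))


raw_sample = {"D-Glucose": 100.5, "ATP": 50.25, "unknown compound": 3.0}
sample = normalize(match_names(raw_sample))
print(substrate_score(["glucose", "ATP"], sample))        # geometric mean of 1.0 and 0.5
print(substrate_score(["glucose", "pyruvate"], sample))   # nan: pyruvate not measured
```

With max-scaling, every abundance lies in [0, 1], so the geometric mean also stays in that range, which matches the value range described for the output file.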

## Examples

### Basic Usage

```bash
# Generate RPS from metabolite data
rps_generator -td /opt/COBRAxy \
              -id /data/metabolomics.tsv \
              -rp /results/rps_scores.tsv
```

### With Custom Reactions

```bash
# Use custom reaction set
rps_generator -td /opt/COBRAxy \
              -id /data/metabolomics.tsv \
              -rl /custom/reactions.tsv \
              -rp /results/custom_rps.tsv \
              -ol /logs/rps.log
```

## Tips and Best Practices

### Data Preparation

- **Metabolite Names**: Use standard nomenclature (KEGG, ChEBI, or common names)
- **Missing Data**: Remove samples with >50% missing metabolites
- **Outliers**: Consider log-transformation for highly variable metabolites
- **Replicates**: Average technical replicates before analysis
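
The preparation steps above could look like this in plain Python. It is a sketch with hypothetical helper names, using `None` for missing values and `log2(x + 1)` as one common transform; whether transformed values are appropriate as RPS input depends on the analysis.

```python
import math


def missing_fraction(values):
    """Fraction of metabolites with no measurement in one sample."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_samples(table, max_missing=0.5):
    """Keep samples whose fraction of missing metabolites is at most max_missing."""
    return {s: v for s, v in table.items() if missing_fraction(v) <= max_missing}


def average_replicates(replicates):
    """Average technical replicates metabolite by metabolite, ignoring missing values."""
    averaged = []
    for values in zip(*replicates):
        present = [v for v in values if v is not None]
        averaged.append(sum(present) / len(present) if present else None)
    return averaged


def log_transform(values):
    """log2(x + 1) keeps zeros at zero while compressing large abundances."""
    return [None if v is None else math.log2(v + 1) for v in values]


# Each list holds one sample's values for glucose, pyruvate, lactate.
table = {
    "Sample1": [100.5, 45.3, 15.8],
    "Sample2": [None, None, 22.1],   # 2 of 3 missing, will be dropped
    "Sample3": [92.7, None, 18.5],   # 1 of 3 missing, kept
}
kept = drop_sparse_samples(table)
print(sorted(kept))                                               # ['Sample1', 'Sample3']
print(average_replicates([[1.0, None, 3.0], [3.0, None, 5.0]]))   # [2.0, None, 4.0]
print(log_transform([0, 1, 3]))                                   # [0.0, 1.0, 2.0]
```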

### Quality Control

- Check the log file for unmatched metabolite names
- Verify RPS score distributions (they should span the 0-1 range)
- Compare results with expected pathway activities

### Integration with Other Tools

RPS scores are typically used with:
- [MAREA](marea.md) for pathway enrichment analysis
- [Flux to Map](flux-to-map.md) for metabolic map visualization

## Troubleshooting

### Common Issues

**No RPS scores generated**
- Check metabolite name format and spelling
- Verify that the input file is valid TSV
- Ensure the tool directory contains the reaction databases

**Many NaN values in output**
- Usually caused by insufficient metabolite coverage for the reactions
- Consider using a smaller, more focused reaction set

**Memory errors**
- Reduce dataset size or split into batches
- Increase available system memory

### Error Messages

| Error | Cause | Solution |
|-------|-------|----------|
| "File not found" | Missing input file | Check file path and permissions |
| "Invalid format" | Malformed TSV | Verify column headers and data types |
| "No metabolites matched" | Name mismatch | Check metabolite nomenclature |

## See Also

- [RAS Generator](ras-generator.md) - Generate reaction activity scores from gene expression
- [MAREA](marea.md) - Statistical analysis and visualization
- [Flux Simulation](flux-simulation.md) - Constraint-based modeling