comparison COBRAxy/docs/tools/marea.md @ 547:73f2f7e2be17 draft

Uploaded
author francesco_lapi
date Tue, 28 Oct 2025 10:44:07 +0000
parents fcdbc81feb45
children
comparison
546:01147e83f43c 547:73f2f7e2be17
1 # MAREA 1 # MAREA
2 2
3 Metabolic Reaction Enrichment Analysis for statistical comparison and map visualization. 3 Metabolic Enrichment Analysis and Visualization.
4 4
5 ## Overview 5 ## Overview
6 6
7 MAREA performs statistical enrichment analysis on RAS/RPS data to identify significantly different metabolic reactions between sample groups. It generates enriched pathway maps with color-coded reactions showing statistical significance and fold changes. 7 MAREA performs statistical comparison of metabolic scores (RAS/RPS) and visualizes results on pathway maps.
8 8
9 ## Usage 9 ## Galaxy Interface
10 10
11 ### Command Line 11 In Galaxy: **COBRAxy → Metabolic Reaction Enrichment Analysis**
12
13 1. Upload RAS/RPS scores and sample class file
14 2. Select map and configure statistical parameters
15 3. Click **Run tool**
16
17 ## Command-line console
12 18
13 ```bash 19 ```bash
14 marea -td /path/to/COBRAxy \ 20 marea -input_data scores.tsv \
15 -using_RAS true \ 21 -input_class classes.csv \
16 -input_data ras_data.tsv \ 22 -choice_map ENGRO2 \
17 -input_class class_labels.tsv \
18 -comparison manyvsmany \ 23 -comparison manyvsmany \
19 -test ks \ 24 -pvalue 0.05 \
20 -pv 0.05 \ 25 -idop output/
21 -fc 1.5 \ 26 ```
27
28 ## Parameters
29
30 | Parameter | Flag | Description | Default |
31 |-----------|------|-------------|---------|
32 | Input Data | `-input_data` | RAS/RPS scores file | - |
33 | Input Class | `-input_class` | Sample class definitions | - |
34 | Map Choice | `-choice_map` | ENGRO2, Recon, or Custom | ENGRO2 |
35 | Custom Map | `-custom_map` | Path to custom SVG map | - |
36 | Comparison | `-comparison` | manyvsmany, onevsrest, onevsmany | manyvsmany |
37 | P-value | `-pvalue` | Significance threshold | 0.05 |
38 | FDR Correction | `-fdr` | Apply FDR correction | true |
39 | Test Type | `-test_type` | t, wilcoxon, ks, DESeq | t |
40 | Net RPS | `--net` | Use net contribution for reversible reactions (RPS only) | false |
41 | Output Path | `-idop` | Output directory | marea/ |
42
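The `-fdr` option in the table above applies a false discovery rate correction across the tested reactions. As a rough illustration of what such a correction does, the sketch below implements the Benjamini-Hochberg procedure in plain Python. Whether MAREA uses exactly this procedure is an assumption; the function name `bh_adjust` and the p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-reaction p-values, for illustration only.
raw = {"R00001": 0.01, "R00002": 0.04, "R00003": 0.03, "R00004": 0.20}
adj = dict(zip(raw, bh_adjust(list(raw.values()))))

# Reactions that remain significant at a 0.05 threshold after correction.
significant = [reaction for reaction, q in adj.items() if q <= 0.05]
```

In this toy input three reactions pass a raw 0.05 threshold, but only one survives the correction, which is why the threshold and the correction setting are best chosen together.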
43 ## Input Formats
44
45 ### Metabolic Scores
46
47 ```
48 Reaction Sample1 Sample2 Sample3
49 R00001 1.25 0.85 1.42
50 R00002 0.65 1.35 0.72
51 ```
52
53 ### Sample Classes
54
55 ```
56 SampleID Class
57 Sample1 Control
58 Sample2 Treatment
59 Sample3 Treatment
60 ```
61
62 **File Format Notes:**
63 - Use **tab-separated** (TSV) or **comma-separated** (CSV) values
64 - First row must contain column headers
65 - Sample names must match between scores and class file
66 - Class names should not contain spaces
67
68 ## Statistical Tests
69
70 - **t**: Student's t-test (parametric, assumes normality)
71 - **wilcoxon**: Wilcoxon/Mann-Whitney (non-parametric)
72 - **ks**: Kolmogorov-Smirnov (distribution-free)
73 - **DESeq**: DESeq2-like test (**RAS only**, requires ≥2 replicates per sample)
74
75 ## Comparison Types
76
77 - **manyvsmany**: All pairwise comparisons
78 - **onevsrest**: Each class vs all others
79 - **onevsmany**: One reference vs multiple classes
80
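To make the three modes concrete, the sketch below enumerates which groups would be compared under each one. It reflects one reading of the descriptions above, not MAREA's internal code; the function `list_comparisons` and its `reference` argument are hypothetical.

```python
from itertools import combinations


def list_comparisons(classes, mode, reference=None):
    """Enumerate (group_a, group_b) pairs for a comparison mode."""
    classes = list(classes)
    if mode == "manyvsmany":
        # Every unordered pair of classes.
        return list(combinations(classes, 2))
    if mode == "onevsrest":
        # Each class against all remaining classes pooled together.
        return [(c, tuple(o for o in classes if o != c)) for c in classes]
    if mode == "onevsmany":
        # One reference class against each other class separately.
        if reference not in classes:
            raise ValueError("onevsmany needs a reference class present in the data")
        return [(reference, c) for c in classes if c != reference]
    raise ValueError(f"unknown comparison mode: {mode}")


groups = ["Control", "Treatment1", "Treatment2"]
pairwise = list_comparisons(groups, "manyvsmany")
one_vs_rest = list_comparisons(groups, "onevsrest")
one_vs_many = list_comparisons(groups, "onevsmany", reference="Control")
```

With three classes, `manyvsmany` and `onevsrest` each produce three comparisons while `onevsmany` produces two, so the choice of mode also changes how many tests are run per reaction.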
81 ## Advanced Options
82
83 ### Net RPS Values
84
85 When analyzing RPS data with reversible reactions, the `--net` parameter controls arrow coloring:
86
87 **When `--net false` (default):**
88 - Each direction of a reversible reaction colored independently
89 - Forward and backward contributions shown separately
90
91 **When `--net true` (RPS only):**
92 - Arrow tips colored with net contribution of both directions
93 - Useful for visualizing overall metabolite flow direction
94
95 **Note**: This option only applies to RPS analysis and affects visualization of reversible reactions on metabolic maps.
96
97 ## Output
98
99 - `*_map.svg`: Annotated pathway maps
100 - `comparison_results.tsv`: Statistical results
101 - `*.log`: Processing log
102
103 ## Examples
104
105 ### Basic Analysis
106
107 ```bash
108 marea -input_data ras_scores.tsv \
109 -input_class classes.csv \
22 -choice_map ENGRO2 \ 110 -choice_map ENGRO2 \
23 -generate_svg true \ 111 -comparison manyvsmany \
112 -pvalue 0.05 \
24 -idop results/ 113 -idop results/
25 ``` 114 ```
26 115
27 ### Galaxy Interface 116 ### Custom Map
28
29 Select "MAREA" from the COBRAxy tool suite and configure analysis parameters through the web interface.
30
31 ## Parameters
32
33 ### Required Parameters
34
35 | Parameter | Flag | Description |
36 |-----------|------|-------------|
37 | Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
38
39 ### Data Input Parameters
40
41 | Parameter | Flag | Description | Default |
42 |-----------|------|-------------|---------|
43 | Use RAS | `-using_RAS, --using_RAS` | Include RAS analysis | true |
44 | RAS Data | `-input_data, --input_data` | RAS scores TSV file | - |
45 | RAS Classes | `-input_class, --input_class` | Sample group labels | - |
46 | Multiple RAS | `-input_datas, --input_datas` | Multiple RAS files (space-separated) | - |
47 | RAS Names | `-names, --names` | Names for multiple datasets | - |
48 | Use RPS | `-using_RPS, --using_RPS` | Include RPS analysis | false |
49 | RPS Data | `-input_data_rps, --input_data_rps` | RPS scores TSV file | - |
50 | RPS Classes | `-input_class_rps, --input_class_rps` | RPS sample groups | - |
51
52 ### Statistical Parameters
53
54 | Parameter | Flag | Description | Default |
55 |-----------|------|-------------|---------|
56 | Comparison Type | `-co, --comparison` | Statistical comparison mode | manyvsmany |
57 | Statistical Test | `-te, --test` | Statistical test method | ks |
58 | P-Value Threshold | `-pv, --pValue` | Significance threshold | 0.1 |
59 | Adjusted P-values | `-adj, --adjusted` | Apply FDR correction | false |
60 | Fold Change | `-fc, --fChange` | Minimum fold change | 1.5 |
61 | Net Enrichment | `-ne, --net` | Use net enrichment for RPS | false |
62 | Analysis Option | `-op, --option` | Analysis mode | datasets |
63
64 ### Visualization Parameters
65
66 | Parameter | Flag | Description | Default |
67 |-----------|------|-------------|---------|
68 | Map Choice | `-choice_map, --choice_map` | Built-in metabolic map | - |
69 | Custom Map | `-custom_map, --custom_map` | Path to custom SVG map | - |
70 | Generate SVG | `-gs, --generate_svg` | Create SVG output | true |
71 | Generate PDF | `-gp, --generate_pdf` | Create PDF output | false |
72 | Generate PNG | `-gpng, --generate_png` | Create PNG output | false |
73 | Color Map | `-colorm, --color_map` | Color scheme (jet/viridis) | jet |
74 | Output Directory | `-idop, --output_path` | Results directory | result/ |
75
76 ### Advanced Parameters
77
78 | Parameter | Flag | Description | Default |
79 |-----------|------|-------------|---------|
80 | Output Log | `-ol, --out_log` | Log file path | - |
81 | Control Sample | `-on, --control` | Control group identifier | - |
82
83 ## Input Formats
84
85 ### RAS/RPS Data File
86
87 Tab-separated format with reactions as rows and samples as columns:
88
89 ```
90 Reaction Sample1 Sample2 Sample3 Sample4
91 R00001 1.25 0.85 1.42 0.78
92 R00002 0.65 1.35 0.72 1.28
93 R00003 2.15 2.05 0.45 0.52
94 ```
95
96 ### Class Labels File
97
98 Sample grouping information:
99
100 ```
101 Sample Class
102 Sample1 Control
103 Sample2 Treatment
104 Sample3 Control
105 Sample4 Treatment
106 ```
107
108 ## Comparison Types
109
110 ### manyvsmany
111 Compare all possible pairs of groups:
112 - Group A vs Group B
113 - Group A vs Group C
114 - Group B vs Group C
115
116 ### onevsrest
117 Compare each group against all others combined:
118 - Group A vs (Group B + Group C)
119 - Group B vs (Group A + Group C)
120
121 ### onevsmany
122 Compare one specific group against each other group separately:
123 - Control vs Treatment1
124 - Control vs Treatment2
125
126 ## Statistical Tests
127
128 | Test | Description | Use Case |
129 |------|-------------|----------|
130 | `ks` | Kolmogorov-Smirnov | Non-parametric, distribution-free |
131 | `ttest_p` | Paired t-test | Related samples |
132 | `ttest_ind` | Independent t-test | Unrelated samples |
133 | `wilcoxon` | Wilcoxon signed-rank | Non-parametric paired |
134 | `mw` | Mann-Whitney U | Non-parametric independent |
135 | `DESeq` | DESeq2-style analysis | Count-like data with dispersion |
136
137 ## Output Files
138
139 ### Statistical Results
140 - `comparison_stats.tsv`: P-values, fold changes, and test statistics
141 - `enriched_reactions.tsv`: Significantly enriched reactions only
142 - `comparison_summary.txt`: Analysis summary and parameters
143
144 ### Visualization
145 - `pathway_map.svg`: Color-coded metabolic map
146 - `pathway_map.pdf`: PDF version (if requested)
147 - `pathway_map.png`: PNG version (if requested)
148 - `legend.svg`: Color scale and significance indicators
149
150 ## Examples
151
152 ### Basic RAS Analysis
153 117
154 ```bash 118 ```bash
155 # Simple two-group comparison 119 marea -input_data rps_scores.tsv \
156 marea -td /opt/COBRAxy \ 120 -input_class classes.csv \
157 -using_RAS true \ 121 -choice_map Custom \
158 -input_data ras_scores.tsv \ 122 -custom_map pathway.svg \
159 -input_class sample_groups.tsv \ 123 -comparison onevsrest \
160 -comparison manyvsmany \
161 -test ks \
162 -pv 0.05 \
163 -choice_map ENGRO2 \
164 -idop results/ 124 -idop results/
165 ``` 125 ```
166 126
167 ### Combined RAS + RPS Analysis 127 ### Non-parametric Test
168 128
169 ```bash 129 ```bash
170 # Multi-modal analysis 130 marea -input_data scores.tsv \
171 marea -td /opt/COBRAxy \ 131 -input_class classes.csv \
172 -using_RAS true \ 132 -choice_map ENGRO2 \
173 -input_data ras_scores.tsv \ 133 -test_type wilcoxon \
174 -input_class ras_groups.tsv \ 134 -pvalue 0.01 \
175 -using_RPS true \ 135 -fdr true \
176 -input_data_rps rps_scores.tsv \ 136 -idop results/
177 -input_class_rps rps_groups.tsv \
178 -comparison onevsrest \
179 -test DESeq \
180 -adj true \
181 -fc 2.0 \
182 -choice_map HMRcore \
183 -generate_pdf true \
184 -idop multimodal_results/
185 ``` 137 ```
186
187 ### Multiple Dataset Analysis
188
189 ```bash
190 # Compare multiple experiments
191 marea -td /opt/COBRAxy \
192 -using_RAS true \
193 -input_datas exp1_ras.tsv exp2_ras.tsv exp3_ras.tsv \
194 -names "Experiment1" "Experiment2" "Experiment3" \
195 -comparison onevsmany \
196 -test wilcoxon \
197 -pv 0.01 \
198 -custom_map custom_pathway.svg \
199 -idop multi_experiment/
200 ```
201
202 ## Map Visualization
203
204 ### Built-in Maps
205 - **ENGRO2**: Human genome-scale reconstruction
206 - **HMRcore**: Core human metabolic network
207 - **Recon**: Recon3D human model
208
209 ### Color Coding
210 - **Red**: Significantly upregulated (high activity)
211 - **Blue**: Significantly downregulated (low activity)
212 - **Gray**: Not significant
213 - **Line Width**: Proportional to fold change magnitude
214
215 ### Custom Maps
216 Custom maps are SVG files whose reaction elements have IDs matching the reaction IDs in your data:
217 ```xml
218 <rect id="R00001" class="reaction" .../>
219 <path id="R00002" class="reaction" .../>
220 ```
221
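Before running an analysis with a custom map, it can help to check that the map and the data share reaction IDs. The sketch below, using only the Python standard library, collects every `id` attribute in an SVG and compares it with a set of reaction IDs. It assumes exact ID matches; real maps may use prefixed or otherwise decorated IDs, and the helper `svg_element_ids` is hypothetical.

```python
import xml.etree.ElementTree as ET


def svg_element_ids(svg_text):
    """Collect the id attribute of every element in an SVG document."""
    root = ET.fromstring(svg_text)
    return {el.get("id") for el in root.iter() if el.get("id")}


# Hypothetical miniature map and reaction list, for illustration only.
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<rect id="R00001" class="reaction"/>'
    '<path id="R00002" class="reaction"/>'
    "</svg>"
)
data_reactions = {"R00001", "R00002", "R00003"}

map_ids = svg_element_ids(svg)
# Reactions present in the data but absent from the map cannot be colored.
missing_from_map = sorted(data_reactions - map_ids)
matched = sorted(data_reactions & map_ids)
```

An empty `matched` list would suggest an ID mismatch between the data and the map.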
222 ## Quality Control
223
224 ### Pre-analysis Checks
225 - Verify sample names match between data and class files
226 - Check for missing values and outliers
227 - Ensure adequate sample sizes per group (n ≥ 3 recommended)
228
229 ### Post-analysis Validation
230 - Review statistical distribution assumptions
231 - Check multiple testing correction effects
232 - Validate biological relevance of enriched pathways
233
234 ## Tips and Best Practices
235
236 ### Statistical Considerations
237 - Use FDR correction (`-adj true`) for multiple comparisons
238 - Choose appropriate tests based on data distribution
239 - Consider effect size alongside significance
240
241 ### Visualization Optimization
242 - Use high fold change thresholds (>2.0) for cleaner maps
243 - Export both SVG (editable) and PDF (publication-ready) formats
244 - Adjust color schemes for colorblind accessibility
245 138
246 ## Troubleshooting 139 ## Troubleshooting
247 140
248 ### Common Issues 141 | Error | Solution |
249 142 |-------|----------|
250 **No significant reactions found** 143 | "No matching reactions" | Verify that reaction IDs in the input data match the IDs on the map |
251 - Lower p-value threshold (`-pv 0.1`) 144 | "Insufficient samples" | Increase sample sizes per group |
252 - Reduce fold change requirement (`-fc 1.2`)
253 - Check sample group definitions
254
255 **Map rendering errors**
256 - Verify SVG map file integrity
257 - Check reaction ID matching between data and map
258 - Ensure sufficient system memory for large maps
259
260 **Statistical test failures**
261 - Verify data normality for parametric tests
262 - Check for sufficient sample sizes
263 - Consider alternative test methods
264
265 ## Integration
266
267 ### Upstream Tools
268 - [RAS Generator](ras-generator.md) - Generate RAS scores
269 - [RPS Generator](rps-generator.md) - Generate RPS scores
270
271 ### Downstream Analysis
272 - [Flux to Map](flux-to-map.md) - Additional visualization options
273 - [MAREA Cluster](marea-cluster.md) - Sample clustering analysis
274 145
275 ## See Also 146 ## See Also
276 147
277 - [Statistical Tests Documentation](/tutorials/statistical-tests.md) 148 - [RAS Generator](tools/ras-generator)
278 - [Map Customization Guide](/tutorials/custom-maps.md) 149 - [RPS Generator](tools/rps-generator)
279 - [Multi-modal Analysis Tutorial](/tutorials/multimodal-analysis.md) 150 - [Flux to Map](tools/flux-to-map)
151 - [Built-in Models](reference/built-in-models)