comparison COBRAxy/docs/troubleshooting.md @ 492:4ed95023af20 draft
Uploaded
| author | francesco_lapi |
|---|---|
| date | Tue, 30 Sep 2025 14:02:17 +0000 |
| parents | |
| children | fcdbc81feb45 |
comparison
| 491:7a413a5ec566 | 492:4ed95023af20 |
|---|---|
| 1 # Troubleshooting | |
| 2 | |
| 3 Common issues and solutions when using COBRAxy. | |
| 4 | |
| 5 ## Installation Issues | |
| 6 | |
| 7 ### Python Import Errors | |
| 8 | |
| 9 **Problem**: `ModuleNotFoundError: No module named 'cobra'` | |
| 10 ```bash | |
| 11 # Solution: Install missing dependencies | |
| 12 pip install cobra pandas numpy scipy | |
| 13 | |
| 14 # Or reinstall COBRAxy | |
| 15 cd COBRAxy | |
| 16 pip install -e . | |
| 17 ``` | |
| 18 | |
| 19 **Problem**: `ImportError: No module named 'cobraxy'` | |
| 20 ```python | |
| 21 # Solution: Add COBRAxy to Python path | |
| 22 import sys | |
| 23 sys.path.insert(0, '/path/to/COBRAxy') | |
| 24 ``` | |
| 25 | |
| 26 ### System Dependencies | |
| 27 | |
| 28 **Problem**: GLPK solver not found | |
| 29 ```bash | |
| 30 # Ubuntu/Debian | |
| 31 sudo apt-get install libglpk40 glpk-utils | |
| 32 pip install swiglpk | |
| 33 | |
| 34 # macOS | |
| 35 brew install glpk | |
| 36 pip install swiglpk | |
| 37 | |
| 38 # Windows (using conda) | |
| 39 conda install -c conda-forge glpk swiglpk | |
| 40 ``` | |
| 41 | |
| 42 **Problem**: SVG processing errors | |
| 43 ```bash | |
| 44 # Install libvips for image processing | |
| 45 # Ubuntu/Debian: sudo apt-get install libvips | |
| 46 # macOS: brew install vips | |
| 47 ``` | |
| 48 | |
| 49 ## Data Format Issues | |
| 50 | |
| 51 ### Gene Expression Problems | |
| 52 | |
| 53 **Problem**: "No computable scores" error | |
| 54 ``` | |
| 55 Cause: Gene IDs don't match between data and model | |
| 56 Solution: | |
| 57 1. Check gene ID format (HGNC vs symbols vs Ensembl) | |
| 58 2. Verify first column contains gene identifiers | |
| 59 3. Ensure tab-separated format | |
| 60 4. Try different built-in model | |
| 61 ``` | |
| 62 | |
| 63 **Problem**: Many "gene not found" warnings | |
| 64 ```python | |
| 65 # Check gene overlap with model | |
| 66 import pickle | |
| 67 genes_dict = pickle.load(open('local/pickle files/ENGRO2_genes.p', 'rb')) | |
| 68 model_genes = set(genes_dict['hugo_id'].keys()) | |
| 69 | |
| 70 import pandas as pd | |
| 71 data_genes = set(pd.read_csv('expression.tsv', sep='\t').iloc[:, 0]) | |
| 72 | |
| 73 overlap = len(model_genes.intersection(data_genes)) | |
| 74 print(f"Gene overlap: {overlap}/{len(data_genes)} ({overlap/len(data_genes)*100:.1f}%)") | |
| 75 ``` | |
| 76 | |
| 77 **Problem**: File format not recognized | |
| 78 ```tsv | |
| 79 # Correct format - tab-separated (exactly one TAB character between columns): | |
| 80 Gene_ID Sample_1 Sample_2 | |
| 81 HGNC:5 10.5 11.2 | |
| 82 HGNC:10 3.2 4.1 | |
| 83 | |
| 84 # Wrong - comma-separated or spaces will fail | |
| 85 ``` | |
| 86 | |
| 87 ### Model Issues | |
| 88 | |
| 89 **Problem**: Custom model not loading | |
| 90 ``` | |
| 91 Solution: | |
| 92 1. Check TSV format with "GPR" column header | |
| 93 2. Verify reaction IDs are unique | |
| 94 3. Test GPR syntax (use 'and'/'or', proper parentheses) | |
| 95 4. Check file permissions and encoding (UTF-8) | |
| 96 ``` | |
| 97 | |
| 98 ## Tool Execution Errors | |
| 99 | |
| 100 | |
| 101 | |
| 102 ### File Path Problems | |
| 103 | |
| 104 **Problem**: "File not found" errors | |
| 105 ```python | |
| 106 # Use absolute paths | |
| 107 from pathlib import Path | |
| 108 | |
| 109 tool_dir = str(Path('/path/to/COBRAxy').absolute()) | |
| 110 input_file = str(Path('expression.tsv').absolute()) | |
| 111 | |
| 112 args = ['-td', tool_dir, '-in', input_file, ...] | |
| 113 ``` | |
| 114 | |
| 115 **Problem**: Permission denied | |
| 116 ```bash | |
| 117 # Check write permissions | |
| 118 ls -la output_directory/ | |
| 119 | |
| 120 # Fix permissions | |
| 121 chmod 755 output_directory/ | |
| 122 chmod 644 input_files/* | |
| 123 ``` | |
| 124 | |
| 125 ### Galaxy Integration Issues | |
| 126 | |
| 127 **Problem**: COBRAxy tools not appearing in Galaxy | |
| 128 ```xml | |
| 129 <!-- Check tool_conf.xml syntax --> | |
| 130 <section id="cobraxy" name="COBRAxy"> | |
| 131 <tool file="cobraxy/ras_generator.xml" /> | |
| 132 </section> | |
| 133 | |
| 134 <!-- Verify file paths are correct, e.g. from the shell: --> | |
| 135 <!-- ls tools/cobraxy/ras_generator.xml --> | |
| 136 ``` | |
| 137 | |
| 138 **Problem**: Tool execution fails in Galaxy | |
| 139 ``` | |
| 140 Check Galaxy logs: | |
| 141 - main.log: General Galaxy issues | |
| 142 - handler.log: Job execution problems | |
| 143 - uwsgi.log: Web server issues | |
| 144 | |
| 145 Common fixes: | |
| 146 1. Restart Galaxy after adding tools | |
| 147 2. Check Python environment has COBRApy installed | |
| 148 3. Verify file permissions on tool files | |
| 149 ``` | |
| 150 | |
| 151 | |
| 152 | |
| 153 **Problem**: Flux sampling hangs | |
| 154 ```bash | |
| 155 # Check solver availability | |
| 156 python -c "import cobra; print(cobra.Configuration().solver)" | |
| 157 | |
| 158 # Should print an optlang interface module for glpk, cplex, or gurobi (e.g. optlang.glpk_interface) | |
| 159 # Install GLPK if missing: | |
| 160 pip install swiglpk | |
| 161 ``` | |
| 162 | |
| 163 ### Large Dataset Handling | |
| 164 | |
| 165 **Problem**: Cannot process large expression matrices | |
| 166 ```python | |
| 167 import pandas as pd | |
| 168 def process_large_dataset(expression_file, chunk_size=1000): | |
| 169     # Process in chunks of samples (columns): every gene row stays available to the GPR rules | |
| 170     df = pd.read_csv(expression_file, sep='\t', index_col=0) | |
| 171     for i in range(0, df.shape[1], chunk_size): | |
| 172         chunk = df.iloc[:, i:i+chunk_size] | |
| 173         chunk_file = f'chunk_{i}.tsv' | |
| 174         chunk.to_csv(chunk_file, sep='\t')  # the gene ID index is written as the first column | |
| 175 | |
| 176         # Process chunk | |
| 177         ras_generator.main(['-in', chunk_file, ...]) | |
| 178 ``` | |
| 179 | |
| 180 ## Output Validation | |
| 181 | |
| 182 ### Unexpected Results | |
| 183 | |
| 184 **Problem**: All RAS values are zero or null | |
| 185 ```python | |
| 186 # Debug gene mapping | |
| 187 import pandas as pd | |
| 188 ras_df = pd.read_csv('ras_output.tsv', sep='\t', index_col=0) | |
| 189 | |
| 190 # Check data quality | |
| 191 print(f"Null percentage: {ras_df.isnull().sum().sum() / ras_df.size * 100:.1f}%") | |
| 192 print(f"Zero percentage: {(ras_df == 0).sum().sum() / ras_df.size * 100:.1f}%") | |
| 193 | |
| 194 # Check expression data preprocessing | |
| 195 expr_df = pd.read_csv('expression.tsv', sep='\t', index_col=0) | |
| 196 print(f"Expression range: {expr_df.min().min():.2f} to {expr_df.max().max():.2f}") | |
| 197 ``` | |
| 198 | |
| 199 **Problem**: RAS values seem too high/low | |
| 200 ``` | |
| 201 Possible causes: | |
| 202 1. Expression data not log-transformed | |
| 203 2. Wrong normalization method | |
| 204 3. Incorrect gene ID mapping | |
| 205 4. GPR rule interpretation issues | |
| 206 | |
| 207 Solutions: | |
| 208 1. Check expression data preprocessing | |
| 209 2. Validate against known control genes | |
| 210 3. Compare with published metabolic activity patterns | |
| 211 ``` | |
| 212 | |
| 213 ### Missing Pathway Maps | |
| 214 | |
| 215 **Problem**: MAREA generates no output maps | |
| 216 ``` | |
| 217 Debug steps: | |
| 218 1. Check RAS input has non-null values | |
| 219 2. Verify model choice matches RAS generation | |
| 220 3. Check statistical significance thresholds | |
| 221 4. Look at log files for specific errors | |
| 222 ``` | |
| 223 | |
| 224 ## Environment Issues | |
| 225 | |
| 226 ### Conda/Virtual Environment Problems | |
| 227 | |
| 228 **Problem**: Tool import fails in virtual environment | |
| 229 ```bash | |
| 230 # Activate environment properly | |
| 231 source venv/bin/activate # Linux/macOS | |
| 232 # or | |
| 233 venv\Scripts\activate # Windows | |
| 234 | |
| 235 # Verify COBRAxy installation | |
| 236 pip list | grep cobra | |
| 237 python -c "import cobra; print('COBRApy version:', cobra.__version__)" | |
| 238 ``` | |
| 239 | |
| 240 **Problem**: Version conflicts | |
| 241 ```bash | |
| 242 # Create clean environment | |
| 243 conda create -n cobraxy python=3.9 | |
| 244 conda activate cobraxy | |
| 245 | |
| 246 # Install COBRAxy fresh | |
| 247 cd COBRAxy | |
| 248 pip install -e . | |
| 249 ``` | |
| 250 | |
| 251 ### Cross-Platform Issues | |
| 252 | |
| 253 **Problem**: Windows path separator issues | |
| 254 ```python | |
| 255 # Use pathlib for cross-platform paths | |
| 256 from pathlib import Path | |
| 257 | |
| 258 # Instead of: '/path/to/file' | |
| 259 # Use: str(Path('path') / 'to' / 'file') | |
| 260 ``` | |
| 261 | |
| 262 **Problem**: Line ending issues (Windows/Unix) | |
| 263 ```bash | |
| 264 # Convert line endings if needed | |
| 265 dos2unix input_file.tsv # CRLF -> LF, for Linux/macOS tools | |
| 266 unix2dos input_file.tsv # LF -> CRLF, for Windows tools | |
| 267 ``` | |
| 268 | |
| 269 ## Debugging Strategies | |
| 270 | |
| 271 ### Enable Detailed Logging | |
| 272 | |
| 273 ```python | |
| 274 import logging | |
| 275 logging.basicConfig(level=logging.DEBUG) | |
| 276 | |
| 277 # Many tools accept log file parameter | |
| 278 args = [..., '--out_log', 'detailed.log'] | |
| 279 ``` | |
| 280 | |
| 281 ### Test with Small Datasets | |
| 282 | |
| 283 ```python | |
| 284 # Create minimal test case | |
| 285 test_data = ("Gene_ID\tSample1\tSample2\n" | |
| 286              "HGNC:5\t10.0\t15.0\n" | |
| 287              "HGNC:10\t5.0\t8.0\n")  # explicit \t so the columns really are tab-separated | |
| 288 | |
| 289 with open('test_input.tsv', 'w') as f: | |
| 290     f.write(test_data) | |
| 291 | |
| 292 # Test basic functionality | |
| 293 ras_generator.main(['-td', tool_dir, '-in', 'test_input.tsv', | |
| 294                     '-ra', 'test_output.tsv', '-rs', 'ENGRO2']) | |
| 295 ``` | |
| 296 | |
| 297 ### Check Dependencies | |
| 298 | |
| 299 ```python | |
| 300 # Verify all required packages | |
| 301 required_packages = ['cobra', 'pandas', 'numpy', 'scipy'] | |
| 302 | |
| 303 for package in required_packages: | |
| 304     try: | |
| 305         __import__(package) | |
| 306         print(f"✓ {package}") | |
| 307     except ImportError: | |
| 308         print(f"✗ {package} - MISSING") | |
| 309 ``` | |
| 310 | |
| 311 ## Getting Help | |
| 312 | |
| 313 ### Information to Include in Bug Reports | |
| 314 | |
| 315 When reporting issues, include: | |
| 316 | |
| 317 1. **System information**: | |
| 318 ```bash | |
| 319 python --version | |
| 320 pip list | grep cobra | |
| 321 uname -a # Linux/macOS | |
| 322 ``` | |
| 323 | |
| 324 2. **Complete error messages**: Copy full traceback | |
| 325 3. **Input file format**: First few lines of input data | |
| 326 4. **Command/parameters used**: Exact command or Python code | |
| 327 5. **Expected vs actual behavior**: What should happen vs what happens | |
| 328 | |
| 329 ### Community Resources | |
| 330 | |
| 331 - **GitHub Issues**: [Report bugs](https://github.com/CompBtBs/COBRAxy/issues) | |
| 332 - **Discussions**: [Ask questions](https://github.com/CompBtBs/COBRAxy/discussions) | |
| 333 - **COBRApy Community**: [General metabolic modeling help](https://github.com/opencobra/cobrapy) | |
| 334 | |
| 335 ### Self-Help Checklist | |
| 336 | |
| 337 Before reporting issues: | |
| 338 | |
| 339 - ✅ Checked this troubleshooting guide | |
| 340 - ✅ Verified installation completeness | |
| 341 - ✅ Tested with built-in example data | |
| 342 - ✅ Searched existing GitHub issues | |
| 343 - ✅ Tried alternative models/parameters | |
| 344 - ✅ Checked file formats and permissions | |
| 345 | |
| 346 ## Prevention Tips | |
| 347 | |
| 348 ### Best Practices | |
| 349 | |
| 350 1. **Use virtual environments** to avoid conflicts | |
| 351 2. **Validate input data** before processing | |
| 352 3. **Start with small datasets** for testing | |
| 353 4. **Keep backups** of working configurations | |
| 354 5. **Document successful workflows** for reuse | |
| 355 6. **Test after updates** to catch regressions | |
| 356 | |
| 357 ### Data Quality Checks | |
| 358 | |
| 359 ```python | |
| 360 import pandas as pd | |
| 361 def validate_expression_data(filename): | |
| 362     """Validate gene expression file format.""" | |
| 363     df = pd.read_csv(filename, sep='\t') | |
| 364     # Check basic format | |
| 365     assert df.shape[0] > 0, "Empty file" | |
| 366     assert df.shape[1] > 1, "Need at least 2 columns" | |
| 367 | |
| 368     # Check numeric data | |
| 369     numeric_cols = df.select_dtypes(include='number').columns | |
| 370     assert len(numeric_cols) > 0, "No numeric expression data" | |
| 371 | |
| 372     # Check for missing values | |
| 373     null_pct = df.isnull().sum().sum() / df.size * 100 | |
| 374     if null_pct > 50: | |
| 375         print(f"Warning: {null_pct:.1f}% missing values") | |
| 376 | |
| 377     print(f"✓ File valid: {df.shape[0]} genes × {df.shape[1]-1} samples") | |
| 378 ``` | |
| 379 | |
| 380 This troubleshooting guide covers the most common issues. For tool-specific problems, check the individual tool documentation pages. |
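
Several of the data format problems listed in this guide (comma- or space-separated files, Windows line endings, non-numeric values) can be caught before any tool is run. The following is a minimal, dependency-free sketch of such a pre-flight check. The function name `check_expression_tsv` and the exact rules it applies are illustrative assumptions, not part of COBRAxy; it only assumes the layout described under "Data Format Issues" (header row, gene identifiers in the first column, one tab between columns).

```python
def check_expression_tsv(text):
    """Return a list of problems found in a tab-separated expression table.

    `text` is the whole file content as a string. An empty list means these
    (deliberately simple, hypothetical) checks found nothing wrong.
    """
    problems = []
    if "\r\n" in text:
        problems.append("CRLF line endings found (consider dos2unix)")

    # Split into non-empty rows, then into tab-separated cells.
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    if len(rows) < 2:
        return problems + ["need a header row and at least one gene row"]

    header, body = rows[0], rows[1:]
    if len(header) < 2:
        problems.append("header has a single column: file may not be tab-separated")

    seen = set()
    for number, row in enumerate(body, start=2):  # row 1 is the header
        if len(row) != len(header):
            problems.append(
                f"row {number}: expected {len(header)} columns, found {len(row)}"
            )
            continue
        gene_id = row[0]
        if gene_id in seen:
            problems.append(f"row {number}: duplicate gene ID {gene_id!r}")
        seen.add(gene_id)
        for value in row[1:]:
            if value == "":
                continue  # treat an empty cell as a missing value
            try:
                float(value)
            except ValueError:
                problems.append(f"row {number}: non-numeric value {value!r}")
                break
    return problems


if __name__ == "__main__":
    sample = "Gene_ID\tSample_1\tSample_2\nHGNC:5\t10.5\t11.2\nHGNC:10\t3.2\t4.1\n"
    print(check_expression_tsv(sample) or "no problems found")
```

To check a real file, read it with `open(path, newline="")` so that line endings are passed through unchanged, then call `check_expression_tsv(f.read())`. Working on the file content as a string keeps the sketch independent of pandas, so it can also run in an environment where the installation itself is what is being debugged.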
