view test-data/README.test_db @ 0:955e33326e20 draft

planemo upload for repository https://github.com/Helmholtz-UFZ/ufz-galaxy-tools/blob/main/tools/longorf/ commit 483ade5362574a59ddc87e3788334bcbff253805
author ufz
date Tue, 18 Jun 2024 14:28:44 +0000

Creating a smaller reference database (see https://github.com/apcamargo/genomad/issues/104#issuecomment-2170949010):

- Download the reference db v1.7
- Store it in the directory `genomad_db` in `test-data`
- Run the test and collect the marker ids with `awk -v FS="\t" 'NR>1 && $9!="NA" {print $9}' output/sequence_annotate/sequence_genes.tsv | sort -u > markers`
- Look up the database key of each marker with `join -1 2 -2 1 genomad_db/genomad_db.lookup markers | cut -d" " -f 2 | sort -u -n > sorted_markers`
- `cd genomad_db`
- `mmseqs createsubdb ~/projects/tools-iuc/tools/genomad/test-data/sorted_markers genomad_db genomad_microdb`
- `mv genomad_microdb.index genomad_db.index`
- `mv genomad_microdb.dbtype genomad_db.dbtype`
- `mv genomad_microdb genomad_db`
- Remove any remaining `genomad_microdb*` files
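
The `awk` and `join` steps can be tried on toy data before running them against the real database. The sketch below is only an illustration: the file names `toy_genes.tsv` and `toy.lookup`, the marker names, and the keys are all made up, and the lookup file is assumed to have the layout key, name, file number (tab separated) with names already in sorted order, as `join` requires. It does not call `mmseqs`.

```shell
# Toy data only: every file name, marker name and key below is hypothetical.
workdir=$(mktemp -d)

# Fake gene table: header line plus 9 tab-separated columns, marker in column 9.
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tc8\tmarker\n'      >  "$workdir/toy_genes.tsv"
printf 'g1\t.\t.\t.\t.\t.\t.\t.\tmarkerC\n'            >> "$workdir/toy_genes.tsv"
printf 'g2\t.\t.\t.\t.\t.\t.\t.\tNA\n'                 >> "$workdir/toy_genes.tsv"
printf 'g3\t.\t.\t.\t.\t.\t.\t.\tmarkerA\n'            >> "$workdir/toy_genes.tsv"
printf 'g4\t.\t.\t.\t.\t.\t.\t.\tmarkerC\n'            >> "$workdir/toy_genes.tsv"

# Fake lookup file (assumed layout: key, name, file number), sorted by name.
printf '10\tmarkerA\t0\n2\tmarkerB\t0\n7\tmarkerC\t0\n' > "$workdir/toy.lookup"

# Step 1: skip the header, drop rows without a marker, de-duplicate the names.
awk -v FS="\t" 'NR>1 && $9!="NA" {print $9}' "$workdir/toy_genes.tsv" \
    | sort -u > "$workdir/markers"

# Step 2: join on the name (field 2 of the lookup, field 1 of markers).
# join prints the join field first, so the key ends up in output field 2.
join -1 2 -2 1 "$workdir/toy.lookup" "$workdir/markers" \
    | cut -d" " -f 2 | sort -u -n > "$workdir/sorted_markers"

cat "$workdir/sorted_markers"
```

With this toy input, `markers` holds `markerA` and `markerC`, and `sorted_markers` holds the keys `7` and `10`. The `-n` flag matters here: a plain lexical sort would put `10` before `7`.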