comparison egapx_runner.xml @ 8:1680e72e27be draft default tip

planemo upload for repository https://github.com/ncbi/egapx commit bdbe05027c2c40e217a2ff0c9e0556450c443e54
author fubar
date Mon, 05 Aug 2024 03:56:41 +0000
parents 9c778770514f
children
comparison
7:9c778770514f 8:1680e72e27be
1 <tool name="egapx_runner" id="egapx_runner" version="6.0.1" profile="22.05"> 1 <tool name="egapx_runner" id="egapx_runner" version="@TOOL_VERSION@" profile="22.05">
2 <description>Runs egapx</description> 2 <description>Runs egapx</description>
3 <macros>
4 <token name="@TOOL_VERSION@">0.02-alpha</token>
5 </macros>
3 <requirements> 6 <requirements>
4 <requirement version="3.12.3" type="package">python</requirement> 7 <requirement version="3.12.3" type="package">python</requirement>
5 <requirement version="24.04.4-0" type="package">nextflow</requirement> 8 <requirement version="24.04.4-0" type="package">nextflow</requirement>
6 <requirement version="6.0.1" type="package">pyyaml</requirement> 9 <requirement version="6.0.1" type="package">pyyaml</requirement>
7 </requirements> 10 </requirements>
8 <version_command><![CDATA[echo "6.0.1"]]></version_command> 11 <version_command><![CDATA[echo "@TOOL_VERSION@"]]></version_command>
9 <command><![CDATA[mkdir -p ./egapx_config && 12 <command><![CDATA[mkdir -p ./egapx_config &&
10 #set econfigfile = $econfig + '.config' 13 #set econfigfile = $econfig + '.config'
11 cp '$__tool_directory__/ui/assets/config/executor/$econfigfile' ./egapx_config/ && 14 cp '$__tool_directory__/ui/assets/config/executor/$econfigfile' ./egapx_config/ &&
12 python '$__tool_directory__/ui/egapx.py' '$yamlconfig' -e '$econfig' -o 'egapx_out']]></command> 15 python '$__tool_directory__/ui/egapx.py' '$yamlconfig' -e '$econfig' -o 'egapx_out']]></command>
13 <inputs> 16 <inputs>
71 74
72 YAML sample configurations can be uploaded into your Galaxy history from the `EGAPx github repository <https://github.com/ncbi/egapx/tree/main/examples/>`_. 75 YAML sample configurations can be uploaded into your Galaxy history from the `EGAPx github repository <https://github.com/ncbi/egapx/tree/main/examples/>`_.
73 The simplest possible example is shown below - can be cut/paste into a history dataset in the upload tool. 76 The simplest possible example is shown below - can be cut/paste into a history dataset in the upload tool.
74 77
75 78
76 *./examples/input_D_farinae_small.yaml* is included in the examples linked above. RNA-seq data is provided as URI to the reads FASTA files. 79 *./examples/input_D_farinae_small.yaml* is shown below and can be cut and pasted into the upload form to create a yaml file.
77 These FASTA files are a sampling of the reads from the complete SRA read files to expedite testing. 80 RNA-seq data is provided as URI to the reads FASTA files.
81
82 input_D_farinae_small.yaml
78 83
79 :: 84 ::
80 85
81 genome: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/020/809/275/GCF_020809275.1_ASM2080927v1/GCF_020809275.1_ASM2080927v1_genomic.fna.gz 86 genome: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/020/809/275/GCF_020809275.1_ASM2080927v1/GCF_020809275.1_ASM2080927v1_genomic.fna.gz
82 taxid: 6954 87 taxid: 6954
85 - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR8506572.2 90 - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR8506572.2
86 - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR9005248.1 91 - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR9005248.1
87 - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR9005248.2 92 - https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/EGAP/data/Dermatophagoides_farinae_small/SRR9005248.2
88 93
89 94
90 95 input_Gavia_stellata.yaml
96
97 ::
98
99 genome: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/030/936/135/GCF_030936135.1_bGavSte3.hap2/GCF_030936135.1_bGavSte3.hap2_genomic.fna.gz
100 reads: txid37040[Organism] AND biomol_transcript[properties] NOT SRS024887[Accession]
101 taxid: 37040
102
103 input_C_longicornis.yaml
104
105 ::
106
107 genome: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/029//603/195/GCF_029603195.1_ASM2960319v2/GCF_029603195.1_ASM2960319v2_genomic.fna.gz
108 reads: txid2530218[Organism] AND biomol_transcript[properties] NOT SRS024887[Accession]
109 taxid: 2530218
110
91 Purpose 111 Purpose
92 ======== 112 ========
93 113
94 **This is not intended for production** 114 **This is not intended for production**
95 115
107 **Warning:** 127 **Warning:**
108 The current version is an alpha release with limited features and organism scope to collect initial feedback on execution. Outputs are not yet complete and not intended for production use. Please open a GitHub [Issue](https://github.com/ncbi/egapx/issues) if you encounter any problems with EGAPx. You can also write to cgr@nlm.nih.gov to give us your feedback or if you have any questions. 128 The current version is an alpha release with limited features and organism scope to collect initial feedback on execution. Outputs are not yet complete and not intended for production use. Please open a GitHub [Issue](https://github.com/ncbi/egapx/issues) if you encounter any problems with EGAPx. You can also write to cgr@nlm.nih.gov to give us your feedback or if you have any questions.
109 129
110 EGAPx is the publicly accessible version of the updated NCBI [Eukaryotic Genome Annotation Pipeline](https://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/). 130 EGAPx is the publicly accessible version of the updated NCBI [Eukaryotic Genome Annotation Pipeline](https://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/).
111 131
112 EGAPx takes an assembly fasta file, a taxid of the organism, and RNA-seq data. Based on the taxid, EGAPx will pick protein sets and HMM models. The pipeline runs `miniprot` to align protein sequences, and `STAR` to align RNA-seq to the assembly. Protein alignments and RNA-seq read alignments are then passed to `Gnomon` for gene prediction. In the first step of `Gnomon`, the short alignments are chained together into putative gene models. In the second step, these predictions are further supplemented by _ab-initio_ predictions based on HMM models. The final annotation for the input assembly is produced as a `gff` file. 132 EGAPx takes an assembly fasta file, a taxid of the organism, and RNA-seq data. Based on the taxid, EGAPx will pick protein sets and HMM models. The pipeline runs `miniprot` to align protein sequences, and `STAR` to align RNA-seq to the assembly. Protein alignments and RNA-seq read alignments are then passed to `Gnomon` for gene prediction. In the first step of `Gnomon`, the short alignments are chained together into putative gene models.
133 In the second step, these predictions are further supplemented by *ab-initio* predictions based on HMM models. The final annotation for the input assembly is produced as a `gff` file.
113 134
114 **Security Notice:** 135 **Security Notice:**
115 136
116 EGAPx has dependencies in and outside of its execution path that include several thousand files from the [NCBI C++ toolkit](https://www.ncbi.nlm.nih.gov/toolkit), and more than a million total lines of code. Static Application Security Testing has shown a small number of verified buffer overrun security vulnerabilities. Users should consult with their organizational security team on risk and if there is concern, consider mitigating options like running via VM or cloud instance. 137 EGAPx has dependencies in and outside of its execution path that include several thousand files from the [NCBI C++ toolkit](https://www.ncbi.nlm.nih.gov/toolkit), and more than a million total lines of code. Static Application Security Testing has shown a small number of verified buffer overrun security vulnerabilities. Users should consult with their organizational security team on risk and if there is concern, consider mitigating options like running via VM or cloud instance.
117 138
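The sample configurations in the help text above share a small set of keys, so a quick sanity check before launching a run can be sketched as follows. This is a minimal sketch, not part of EGAPx or of this tool: the function name `check_egapx_config` is hypothetical, and the assumption that `genome`, `taxid`, and `reads` are all required is inferred only from the samples shown here. The dictionary would typically come from loading the YAML file, for example with PyYAML's `yaml.safe_load`.

```python
from typing import Any

# Assumption: key set inferred from the sample YAML files in the help text,
# not taken from EGAPx's own validation logic.
REQUIRED_KEYS = ("genome", "taxid", "reads")


def check_egapx_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # Report every missing key, in a fixed order.
    for key in REQUIRED_KEYS:
        if key not in cfg:
            problems.append(f"missing key: {key}")

    # The genome is given as a single path or URL string in the samples.
    genome = cfg.get("genome")
    if genome is not None and not isinstance(genome, str):
        problems.append("genome should be a path or URL string")

    # The taxid is a positive integer in the samples (bool is excluded
    # explicitly because it is a subclass of int in Python).
    taxid = cfg.get("taxid")
    if taxid is not None and (
        isinstance(taxid, bool) or not isinstance(taxid, int) or taxid <= 0
    ):
        problems.append("taxid should be a positive integer")

    # The samples show reads either as a list of URIs or as one SRA query string.
    reads = cfg.get("reads")
    if isinstance(reads, str):
        if not reads.strip():
            problems.append("reads query string should not be empty")
    elif isinstance(reads, list):
        if not reads or not all(isinstance(r, str) and r for r in reads):
            problems.append("reads list should contain non-empty strings")
    elif reads is not None:
        problems.append("reads should be a list of URIs or an SRA query string")

    return problems


if __name__ == "__main__":
    # Values copied from the input_Gavia_stellata.yaml sample above.
    sample = {
        "genome": "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/030/936/135/"
        "GCF_030936135.1_bGavSte3.hap2/GCF_030936135.1_bGavSte3.hap2_genomic.fna.gz",
        "reads": "txid37040[Organism] AND biomol_transcript[properties] "
        "NOT SRS024887[Accession]",
        "taxid": 37040,
    }
    print(check_egapx_config(sample))
```

A check like this only catches structural slips such as a missing key or a quoted taxid; it does not confirm that the genome URL resolves or that the SRA query returns any runs.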