comparison generate-putative-osp.xml @ 1:05b97a4dce94 draft
planemo upload commit 94b0cd1fff0826c6db3e7dc0c91c0c5a8be8bb0c
author | cpt |
---|---|
date | Mon, 05 Jun 2023 02:51:44 +0000 |
parents | |
children | 05244a021b80 |
0:03670eba3480 | 1:05b97a4dce94 |
---|---|
1 <tool id="edu.tamu.cpt2.spanin.generate-putative-osp" name="OSP candidates" version="1.0"> | |
2 <description>constructs a list of putative o-spanins from an input genomic FASTA</description> | |
3 <macros> | |
4 <import>macros.xml</import> | |
5 <import>cpt-macros.xml</import> | |
6 </macros> | |
7 <expand macro="requirements"> | |
8 </expand> | |
9 <command detect_errors="aggressive"><![CDATA[ | |
10 '$__tool_directory__/generate-putative-osp.py' | |
11 '$fasta_file' | |
12 --strand '$strand' | |
13 --switch '$switch' | |
14 --osp_on '$osp_on' | |
15 --osp_op '$osp_op' | |
16 --osp_ob '$osp_ob' | |
17 --osp_og '$osp_og' | |
18 --osp_min_len '$osp_min_len' | |
19 --putative_osp '$putative_osp' | |
20 --summary_osp_txt '$summary_osp' | |
21 --putative_osp_gff '$putative_osp_gff' | |
22 --min_lipo_after '$lipo_min' | |
23 --max_lipo_after '$lipo_max' | |
24 --osp_max '$osp_max' | |
25 ]]></command> | |
26 <inputs> | |
27 <param type="select" label="Strand Choice" name="strand"> | |
28 <option value="both">both</option> | |
29 <option value="forward">+</option> | |
30 <option value="reverse">-</option> | |
31 </param> | |
32 <param label="Single Genome FASTA" name="fasta_file" type="data" format="fasta"/> | |
33 <param label="o-spanin minimum length" name="osp_min_len" type="integer" value="45"/> | |
34 <param label="o-spanin maximum length" name="osp_max" type="integer" value="200"/> | |
35 <param label="Range Selection; default is all. To check a specific range for a spanin, enter two integers separated by a colon (e.g. 1234:4321)" type="text" name="switch" value="all"/> | |
36 <param label="Lipobox minimum distance from start codon" name="osp_min_dist" type="integer" value="10"/> | |
37 <param label="Lipobox maximum distance from start codon" name="osp_max_dist" type="integer" value="60" help="Searches for a Lipobox between Lipoboxmin and Lipoboxmax, i.e. [Lipoboxmin,Lipoboxmax]"/> | |
38 <param label="Minimum number of residues after the lipobox" name="lipo_min" type="integer" value="25"/> | |
39 <param label="Maximum number of residues after the lipobox" name="lipo_max" type="integer" value="170"/> | |
40 </inputs> | |
41 <outputs> | |
42 <data format="fasta" name="osp_on" label="NucSequences.fa" hidden="true"/> | |
43 <data format="fasta" name="osp_op" label="ProtSequences.fa" hidden="true"/> | |
44 <data format="bed" name="osp_ob" label="BED_Output.bed" hidden="true"/> | |
45 <data format="gff3" name="osp_og" label="GFF_Output.gff" hidden="true"/> | |
46 <data format="fasta" name="putative_osp" label="putative_osp.fa"/> | |
47 <data format="txt" name="summary_osp" label="summary_osp.txt"/> | |
48 <data format="gff3" name="putative_osp_gff" label="putative_osp.gff3"/> | |
49 </outputs> | |
50 <help><![CDATA[ | |
51 | |
52 **What it does** | |
53 Searches a genome for candidate o-spanins (OSPs), phage proteins involved in outer membrane disruption during lysis of Gram-negative bacterial host cells. | |
54 | |
55 | |
56 **METHODOLOGY** | |
57 | |
58 Locates ALL potential start codons (TTG / ATG / GTG, encoding L / M / V). This list is pared down to ORFs within the user-set min/max lengths. The filtered list is written out as a set of files with the ORFs in FASTA (nt and aa), BED, and GFF3 formats. | |
59 | |
60 For each sequence in the protein FASTA, the tool then searches within the user-specified range (min/max distance from the start codon) with a regular expression (RegEx) to identify a potential lipobox. The following patterns are accepted as a potential lipobox: | |
61 | |
62 * [ILMFTV][^REKD][GAS]C | |
63 * AW[AGS]C | |
64 | |
65 Finally, the protein list is filtered for size with user-set periplasmic length parameters, calculated as the number of residues after the putative lipobox. | |
66 | |
67 **INPUT** --> Genomic FASTA | |
68 *NOTE: This tool only takes a SINGLE genomic fasta. It does not work with multiFASTAs.* | |
69 | |
70 **OUTPUT** --> putative_osp.fa (FASTA) file, putative_osp.gff3, and a basic summary statistics file, summary_osp.txt | |
71 Protein sequences which passed the above filters are returned as the candidate OSPs. | |
72 ]]></help> | |
73 <expand macro="citations-crr"/> | |
74 </tool> |
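
The lipobox search and the periplasmic-length filter described in the tool's help text can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code in `generate-putative-osp.py`: the function names `find_lipobox` and `is_candidate` are hypothetical, the distance is assumed to be a 0-based residue index counted from the start residue, and both bounds are assumed to be inclusive (as the `[Lipoboxmin,Lipoboxmax]` notation in the help suggests). Only the two regular expressions and the default values are taken from the wrapper above.

```python
import re

# Lipobox patterns exactly as listed in the tool's help text.
LIPOBOX_PATTERNS = [
    re.compile(r"[ILMFTV][^REKD][GAS]C"),
    re.compile(r"AW[AGS]C"),
]
LIPOBOX_LEN = 4  # both patterns span four residues


def find_lipobox(protein, min_dist=10, max_dist=60):
    """Return the index of the first lipobox starting within
    [min_dist, max_dist], or None if there is none.

    Assumption: distance is the 0-based index of the first lipobox
    residue, and both bounds are inclusive.
    """
    last_start = min(max_dist, len(protein) - LIPOBOX_LEN)
    for start in range(min_dist, last_start + 1):
        # pattern.match(string, pos) anchors the match at `pos`.
        if any(p.match(protein, start) for p in LIPOBOX_PATTERNS):
            return start
    return None


def is_candidate(protein, min_dist=10, max_dist=60,
                 min_after=25, max_after=170):
    """Apply the lipobox check, then the residues-after-lipobox filter."""
    start = find_lipobox(protein, min_dist, max_dist)
    if start is None:
        return False
    residues_after = len(protein) - (start + LIPOBOX_LEN)
    return min_after <= residues_after <= max_after


if __name__ == "__main__":
    # Hypothetical sequence: lipobox "LAGC" at index 12, 30 residues after it.
    example = "M" + "K" * 11 + "LAGC" + "S" * 30
    print(find_lipobox(example), is_candidate(example))
```

The defaults mirror the wrapper's parameter values (`osp_min_dist`=10, `osp_max_dist`=60, `lipo_min`=25, `lipo_max`=170). Checking each start position with an anchored match, rather than taking the first hit of an unanchored search, keeps a lipobox-like motif outside the window from hiding a valid one inside it.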