comparison spolpred.xml @ 0:5402893569cb draft
planemo upload commit 870da8582a7bc43817b1de0720397ae60a8efef6-dirty
author | nml |
---|---|
date | Tue, 15 Dec 2015 14:19:42 -0500 |
-1:000000000000 | 0:5402893569cb |
---|---|
1 <?xml version="1.0"?> | |
2 <tool id="spolpred" name="SpolPred" version="1.0.0"> | |
3 <description>with options and commands</description> | |
4 <requirements> | |
5 <requirement type="package" version="1.0.0">spolpred</requirement> | |
6 </requirements> | |
7 <command interpreter="bash"> | |
8 | |
9 #set $output=$input_file.name | |
10 | |
11 spolpred.sh "$input_file.name" $input_file | |
12 | |
13 -l $read_length -b $type_reads -d $more_details -s $screening_options.stop_screening | |
14 | |
15 #if $screening_options.stop_screening == "on": | |
16 -a $screening_options.screening_threshold | |
17 #end if | |
18 | |
19 -m $matching_threshold | |
20 | |
21 </command> | |
22 <inputs> | |
23 <param name="input_file" type="data" format="fastqsanger" label="FASTQ input file"/> | |
24 | |
25 <param name="read_length" type="integer" label="Read length [35, 1000]" value="75"> | |
26 <validator type="in_range" min="35" max="1000" message="Must be between 35 and 1000 (inclusive)"/> | |
27 </param> | |
28 | |
29 <param name="type_reads" type="select" label="Type of input reads"> | |
30 <option value="d">Direct</option> | |
31 <option value="r">Reverse</option> | |
32 </param> | |
33 | |
34 <param name="more_details" type="select" label="Level of processing output detail" | |
35 help="If set to High, processing details are written to the job's STDOUT, including the | |
36 number of processed reads and the number of spacer sequences found"> | |
37 <option value="on">High</option> | |
38 <option value="off">Normal</option> | |
39 </param> | |
40 | |
41 <conditional name="screening_options"> | |
42 <param name="stop_screening" type="select" label="Read screening" | |
43 help="Used to end read processing when Screening Threshold is reached"> | |
44 <option value="on">Perform read screening</option> | |
45 <option value="off">Do not perform read screening</option> | |
46 </param> | |
47 <when value="on"> | |
48 <param name="screening_threshold" type="integer" label="Screening threshold" value="50" | |
49 help="Average number of spacer occurrences used to stop screening"> | |
50 <validator type="in_range" min="0" max="inf" message="Must be at least 0"/> | |
51 </param> | |
52 </when> | |
53 <when value="off"/> | |
54 </conditional> | |
55 | |
56 <param name="matching_threshold" type="integer" label="Matching threshold" value="4" | |
57 help="Minimum number of spacer occurrences below which spacer absence is assigned"> | |
58 <validator type="in_range" min="1" max="inf" message="Must be at least 1"/> | |
59 </param> | |
60 | |
61 </inputs> | |
62 <outputs> | |
63 <data name="outfile" format="tabular" from_work_dir="output.txt"/> | |
64 </outputs> | |
65 | |
66 <tests> | |
67 <test> | |
68 <param/> | |
69 <output/> | |
70 </test> | |
71 </tests> | |
72 | |
73 <help> | |
74 **Frequently Asked Questions** | |
75 | |
76 **SpolPred only accepts one FASTQ file. What if I have paired-end reads?** | |
77 | |
78 Forward and reverse read files can be merged into one using the Perl script | |
79 shuffleSequences_fastq.pl provided in the Velvet software suite. A SpolPred run on the merged file will take longer | |
80 than one using only forward or reverse reads. In our dataset (see Methods for more details), the forward file alone | |
81 had enough reads to find all present spacers and infer the octal code for 49 out of 51 samples. Whether | |
82 to merge the files therefore depends on the sample coverage depth. | |
83 | |
84 | |
85 | |
86 **What if I have a FASTA file?** | |
87 | |
88 SpolPred was designed specifically to process raw reads and therefore only supports sequence | |
89 files in FASTQ format. | |
90 | |
91 | |
92 | |
93 **What is the point of stopping the read screening?** | |
94 | |
95 By default, all reads in the FASTQ file are processed. However, we have observed that a point is | |
96 reached when no more reads are needed to infer the octal code; in other words, the number of spacer | |
97 occurrences is high and stable enough to assume that all present spacers have already been found. | |
98 Stopping the program at this point saves time and computing resources. If coverage | |
99 is low, stopping the screening early is not advisable. | |
100 | |
101 | |
102 | |
103 **How do I choose the Screening threshold?** | |
104 | |
105 If you have decided to scan the whole input file, there is no need to set this threshold. The Screening | |
106 threshold tells the program when the screening should stop. An appropriate value depends on | |
107 read coverage. Running the software and looking at the number of times the spacers are detected will | |
108 provide insight into both the coverage and the most appropriate threshold value. | |
109 | |
110 | |
111 | |
112 **Why is a Matching threshold required? Are spacers not supposed to occur uniquely?** | |
113 | |
114 The number of times each spacer is found is tracked during the screening, and absence is assigned when | |
115 that number does not reach a user-defined threshold (4 by default). This threshold, here called the | |
116 Matching threshold, was implemented because a few spurious matches were found for some absent | |
117 spacers. Those false positives are likely related to read-quality issues, such as | |
118 sequencing errors. In our data set, no more than 3 false matches were detected for absent spacers, in | |
119 contrast to the 50-150 matches found per present spacer. | |
120 | |
121 | |
122 | |
123 **Should I be worried then about false positive matches?** | |
124 | |
125 As long as proper pre-filtering steps are applied to the raw reads, no significant issues are expected | |
126 to arise. | |
127 | |
128 | |
129 | |
130 **Can I change the number of allowed SNPs when querying the spacers?** | |
131 | |
132 This option has not been implemented. Spacer sequences are conserved, and at most one SNP has been | |
133 reported to occur. | |
134 | |
135 | |
136 | |
137 **Why are exact matches output as well?** | |
138 | |
139 The number of exact read-spacer matches, i.e. matches found without allowing SNPs, makes it easy to | |
140 identify SNPs in spacer sequences. Exact matches are not used when inferring the octal | |
141 code. | |
142 | |
143 Wrapper Author: Mark Iskander | |
144 </help> | |
145 <citations> | |
146 <citation type="doi">10.1093/bioinformatics/bts544</citation> | |
147 </citations> | |
148 </tool> |
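
The Matching threshold FAQ in the help text above describes how per-spacer match counts become presence or absence calls before the octal code is inferred. The following is a minimal sketch of that idea, not SpolPred's actual implementation. The function names `presence_pattern` and `binary_to_octal` are hypothetical. The sketch also assumes the standard 43-spacer spoligotyping scheme and the conventional 15-digit octal encoding (14 triplets plus one final binary digit), neither of which is stated in the wrapper itself.

```python
from typing import Sequence

# Assumption: the standard spoligotyping scheme with 43 spacers.
NUM_SPACERS = 43


def presence_pattern(counts: Sequence[int], matching_threshold: int = 4) -> str:
    """Return '1' for each spacer whose match count reaches the threshold, else '0'.

    Mirrors the help text: absence is assigned when the count does not
    reach the Matching threshold (4 by default).
    """
    return "".join("1" if c >= matching_threshold else "0" for c in counts)


def binary_to_octal(pattern: str) -> str:
    """Convert a 43-digit binary presence pattern into a 15-digit octal code.

    Assumed convention: the first 42 digits are read as 14 triplets, each
    converted to one octal digit; the 43rd digit is appended unchanged.
    """
    if len(pattern) != NUM_SPACERS or set(pattern) - {"0", "1"}:
        raise ValueError("expected a string of 43 binary digits")
    digits = [str(int(pattern[i:i + 3], 2)) for i in range(0, NUM_SPACERS - 1, 3)]
    digits.append(pattern[-1])
    return "".join(digits)


if __name__ == "__main__":
    # Hypothetical counts: spacer 1 has only 3 spurious matches, the rest are present.
    counts = [3] + [60] * 42
    print(binary_to_octal(presence_pattern(counts)))  # prints 377777777777771
```

With the default threshold of 4, a spacer matched only 3 times is called absent, which is consistent with the help text's observation that absent spacers produced no more than 3 false matches in the authors' data set.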