# HG changeset patch
# User nml
# Date 1450207182 18000
# Node ID 5402893569cb1a87f992511ac0667713debe5345
planemo upload commit 870da8582a7bc43817b1de0720397ae60a8efef6-dirty
diff -r 000000000000 -r 5402893569cb spolpred.sh
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/spolpred.sh Tue Dec 15 14:19:42 2015 -0500
@@ -0,0 +1,10 @@
+#!/bin/bash
+
+name=$1
+shift
+
+spolpred "$@"
+
+sed -i s/^.*\t/$name/ output.txt
+
+exit 0
diff -r 000000000000 -r 5402893569cb spolpred.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/spolpred.xml Tue Dec 15 14:19:42 2015 -0500
@@ -0,0 +1,148 @@
+
+
+ with options and commands
+
+ spolpred
+
+
+
+#set $output=$input_file.name
+
+spolpred.sh "$input_file.name" $input_file
+
+-l $read_length -b $type_reads -d $more_details -s $screening_options.stop_screening
+
+#if $screening_options.stop_screening == "on":
+ -a $screening_options.screening_threshold
+#end if
+
+-m $matching_threshold
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ **Frequently Asked Questions**
+
+ **SpolPred only accepts one FASTQ file. What if I have paired-end reads?**
+
+ Forward and reverse read files can be merged into one using the Perl script
+ shuffleSequences_fastq.pl provided in the Velvet software suite. The SpolPred run will then take longer
+ than when using only forward or only reverse reads. In our dataset (read Methods for more details), the forward file
+ had enough reads to find all present spacers and infer the octal code for 49 out of 51 samples. Whether
+ to merge the files therefore depends on the sample coverage depth.
+
+
+
+ **What if I have a FASTA file?**
+
+ SpolPred was designed specifically to process raw reads and therefore only supports sequence
+ files in FASTQ format.
+
+
+
+ **What is the point of stopping the read screening?**
+
+ By default, all reads in the FASTQ file are processed. However, we have observed that a point is
+ reached at which no more reads are needed to infer the octal code; in other words, the number of spacer
+ occurrences is high and steady enough to assume that all present spacers have already been found.
+ Stopping the program at this point saves time and computing resources. If coverage is low,
+ stopping the screening early is not advisable.
+
+
+
+ **How do I choose the Screening threshold?**
+
+ If you have decided to scan the whole input file, there is no need to set this threshold. The Screening
+ threshold tells the program when the screening should stop. Its value depends on
+ read coverage. Running the software and looking at the number of times each spacer is detected will
+ provide insight into both the coverage and the most appropriate threshold value.
+
+
+
+ **Why is a Matching threshold required? Are spacers not supposed to occur uniquely?**
+
+ The number of times each spacer is found is tracked during the screening, and a spacer is called absent when
+ that number does not reach a user-defined threshold (4 times by default). This threshold, here called the
+ Matching threshold, had to be implemented because a few spurious matches were found for some
+ absent spacers. Those false positives are likely to be related to read-quality issues such as
+ sequencing errors. In our data set, no more than 3 false matches were detected for absent spacers, in
+ contrast to the 50-150 matches found per present spacer.
+
+
+
+ **Should I be worried then about false positive matches?**
+
+ As long as proper pre-filtering steps are applied to the raw reads, no significant problems are expected
+ to arise.
+
+
+
+ **Can I change the number of allowed SNPs when querying the spacers?**
+
+ This option has not been implemented. Spacer sequences are conserved, and at most one SNP has been
+ reported to occur.
+
+
+
+ **Why are exact matches output as well?**
+
+ The number of read-spacer exact matches, i.e. matches without any SNPs allowed, makes it easy to
+ identify SNPs in spacer sequences. Exact matches are not used when inferring the
+ octal code.
+
+ Wrapper Author: Mark Iskander
+
+
+ 10.1093/bioinformatics/bts544
+
+
diff -r 000000000000 -r 5402893569cb tool_dependencies.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_dependencies.xml Tue Dec 15 14:19:42 2015 -0500
@@ -0,0 +1,6 @@
+
+
+
+
+
+