comparison overlapping_reads.xml @ 5:a7fd04208764 draft

planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/small_rna_signatures commit 24d44a9b7ec9db4dce3d839b597eea2b1be34adb
author artbio
date Sat, 09 Sep 2017 11:57:39 -0400
parents 20d28cfdeefe
children 4da23f009c9e
comparison
4:20d28cfdeefe 5:a7fd04208764
1 <tool id="overlapping_reads" name="Get overlapping reads" version="0.9.3"> 1 <tool id="overlapping_reads" name="Get overlapping reads" version="0.9.4">
2 <description /> 2 <description />
3 <requirements> 3 <requirements>
4 <requirement type="package" version="0.11.2.1=py27_0">pysam</requirement> 4 <requirement type="package" version="0.11.2.1=py27_0">pysam</requirement>
5 </requirements> 5 </requirements>
6 <stdio> 6 <stdio>
45 <param name="mintarget" value="23" /> 45 <param name="mintarget" value="23" />
46 <param name="maxtarget" value="29" /> 46 <param name="maxtarget" value="29" />
47 <param name="overlap" value="10" /> 47 <param name="overlap" value="10" />
48 <output file="paired_2.fa" ftype="fasta" name="output" /> 48 <output file="paired_2.fa" ftype="fasta" name="output" />
49 </test> 49 </test>
50 <test>
51 <param ftype="bam" name="input" value="sr_bowtie.bam" />
52 <param name="minquery" value="23" />
53 <param name="maxquery" value="29" />
54 <param name="mintarget" value="20" />
55 <param name="maxtarget" value="22" />
56 <param name="overlap" value="10" />
57 <output file="paired_3.fa" ftype="fasta" name="output" />
58 </test>
59 <test>
60 <param ftype="bam" name="input" value="sr_bowtie.bam" />
61 <param name="minquery" value="20" />
62 <param name="maxquery" value="22" />
63 <param name="mintarget" value="20" />
64 <param name="maxtarget" value="22" />
65 <param name="overlap" value="10" />
66 <output file="paired_4.fa" ftype="fasta" name="output" />
67 </test>
50 </tests> 68 </tests>
51 <help> 69 <help>
52 70
53 **What it does** 71 **What it does**
54 72
68 The algorithm search for each *query* reads (of specified size) in the bam alignment if 86 The algorithm search for each *query* reads (of specified size) in the bam alignment if
69 there are *target* reads (of specified size) that align on the opposite strand with a 10 nt 87 there are *target* reads (of specified size) that align on the opposite strand with a 10 nt
70 overlap. 88 overlap.
71 89
72 Searching query reads of 20-22 nt that overlap by 10 nt with target 90 Searching query reads of 20-22 nt that overlap by 10 nt with target
73 reads of 23-29 nt is different from searching query reads of 23-29 nt that overlap by 10 nt 91 reads of 23-29 nt is equivalent to searching query reads of 23-29 nt that overlap by 10 nt
74 with target reads of 20-22 nt. i.e, searching for siRNAs that pair with piRNAs is distinct 92 with target reads of 20-22 nt. i.e, searching for siRNAs that pair with piRNAs is equivalent
75 from searching for siRNAs that pairs with piRNAs, although of course the number of possibly 93 to searching for siRNAs that pairs with piRNAs. In contrast, searching query reads of 20-22 nt
76 formed piRNA/siRNA pairs is the same as the number of possibly formed siRNA/piRNA pairs. 94 that overlap by 10 nt with target reads of 23-29 nt is different from searching query reads of
95 23-29 nt that overlap by 10 nt with target reads of 23-29 nt, since the number of "heterotypic"
96 pairs of reads is likely to be different from the number of "homotypic" pairs of reads.
77 97
78 *Overlap* 98 *Overlap*
79 The number of nucleotides by which the pairs of sequences will overlap 99 The number of nucleotides by which the pairs of sequences will overlap
80 100
81 101
82 102
83 **Outputs** 103 **Outputs**
84 104
85 a fasta file of pairable reads such as : 105 a fasta file of pairable reads such as :
86 106
87 >FBgn0000004_17.6|5855|F|23|n=1 107 >FBgn0000004_17.6|coord=5839|strand -|size=26|nreads=1
108
109 TTTTCGTCAATTGTGCCAAATAGGTA
110
111 >FBgn0000004_17.6|coord=5855|strand +|size=23|nreads=1
88 112
89 TTGACGAAAATGATCGAGTGGAT 113 TTGACGAAAATGATCGAGTGGAT
90 114
91 >FBgn0000004_17.6|5839|R|26|n=1
92
93 TTTTCGTCAATTGTGCCAAATAGGTA
94 115
95 where FBgn0000004_17.6 stands for the chromosome, 5839 stands for the 1-based read position, 116 where FBgn0000004_17.6 stands for the chromosome, 5839 stands for the 1-based read position,
96 R stand for reverse strand (F forward strand), 26 stands for the size of the sequence and 117 'strand -' stands for lower strand of chromosome, 26 stands for the size of the sequence and
97 n=1 stands for the number of reads of the sequence in the dataset. 118 nreads=1 stands for the number of reads of the sequence in the dataset.
98 119
99 the second sequence in this example corresponds to 1 read that overlap by 10 nt with 120 the second sequence in this example corresponds to 1 read that overlap by 10 nt with
100 1 read of the first sequence. 121 1 read of the first sequence.
101 122
102 </help> 123 </help>
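
Note on the new header format: the revised help text describes headers of the form chromosome|coord=...|strand ...|size=...|nreads=... and states that the two example reads overlap by 10 nt. The following is a minimal sketch for checking that arithmetic, not the tool's own code. The function names (parse_header, overlap_length, reverse_complement) are hypothetical, and the sketch assumes that coord is the 1-based leftmost position of the read on the forward strand and that each sequence is written 5' to 3' as read. It is written for Python 3, whereas the tool itself pins a Python 2.7 build of pysam.

```python
def parse_header(header):
    """Split a header like '>chrom|coord=5839|strand -|size=26|nreads=1'."""
    chrom, coord, strand, size, nreads = header.lstrip(">").split("|")
    return {
        "chrom": chrom,
        "coord": int(coord.split("=")[1]),
        "strand": strand.split()[1],  # 'strand -' -> '-'
        "size": int(size.split("=")[1]),
        "nreads": int(nreads.split("=")[1]),
    }


def overlap_length(a, b):
    """Count shared positions, assuming coord is the 1-based leftmost position."""
    if a["chrom"] != b["chrom"]:
        return 0
    start = max(a["coord"], b["coord"])
    end = min(a["coord"] + a["size"] - 1, b["coord"] + b["size"] - 1)
    return max(0, end - start + 1)


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# The two records shown in the help text example.
minus = parse_header(">FBgn0000004_17.6|coord=5839|strand -|size=26|nreads=1")
plus = parse_header(">FBgn0000004_17.6|coord=5855|strand +|size=23|nreads=1")
minus_seq = "TTTTCGTCAATTGTGCCAAATAGGTA"
plus_seq = "TTGACGAAAATGATCGAGTGGAT"

n = overlap_length(minus, plus)
print(n)  # 10
# The first n bases of each read should be reverse complements of each other.
print(reverse_complement(plus_seq[:n]) == minus_seq[:n])  # True
```

Under these assumptions the minus-strand read covers positions 5839 to 5864 and the plus-strand read covers 5855 to 5877, so they share positions 5855 to 5864. The 5' ends of the two example sequences are also complementary over those 10 nt, which is consistent with the pairing described in the help text.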