comparison small_rna_maps.xml @ 23:3ca8113cc758 draft

planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/small_rna_maps commit 15cc0c091844f9b87dc2ec2abd773b4aa26e2a67
author artbio
date Tue, 25 Dec 2018 06:02:08 -0500
parents 29f03c13c7a2
children e75a10eba0a6
22:29f03c13c7a2 23:3ca8113cc758
1 <tool id="small_rna_maps" name="small_rna_maps" version="2.11.0"> 1 <tool id="small_rna_maps" name="small_rna_maps" version="2.11.1">
2 <description></description> 2 <description></description>
3 <requirements> 3 <requirements>
4 <requirement type="package" version="1.11.2=py27_0">numpy</requirement> 4 <requirement type="package" version="1.11.2=py27_0">numpy</requirement>
5 <requirement type="package" version="0.11.2.1=py27_0">pysam</requirement> 5 <requirement type="package" version="0.11.2.1=py27_0">pysam</requirement>
6 <requirement type="package" version="1.3.2=r3.3.2_0">r-optparse</requirement> 6 <requirement type="package" version="1.3.2=r3.3.2_0">r-optparse</requirement>
349 </tests> 349 </tests>
350 <help> 350 <help>
351 351
352 **What it does** 352 **What it does**
353 353
354 Plots mapping statistics of an alignment along the reference chromosomes : 354 Plots mapping statistics of read alignments along reference chromosomes or genes or arbitrary regions :
355 355
356 - counts 356 - counts
357 - mean sizes 357 - mean sizes
358 - median sizes 358 - median sizes
359 - coverage depth 359 - coverage depth
370 Or in all possible pairwise combinations: 370 Or in all possible pairwise combinations:
371 371
372 .. image:: two_plot.png 372 .. image:: two_plot.png
373 373
374 For comparison purposes, values from bam alignment files can be normalized by a size factor 374 For comparison purposes, values from bam alignment files can be normalized by a size factor
375 before plotting. 375 before plotting (Normalisation field).
376
377 *Cluster mode*
378
379 Clusters of read alignments are aggregated along regions of *variable* length. The clustering
380 algorithm works as follows:
381
382 A read is clustered with the next read on the genomic reference if the two reads are
383 separated by at most the clustering distance (set in nucleotides). If clustered, the step is
384 repeated with the next read until clustering fails. A new cluster is then started.
385
386 The clustering procedure can either take the polarity of reads into account (forward
387 reads and reverse reads are then clustered separately) or ignore this polarity.
388
389 Clustered reads are plotted like single reads, their coordinate being the median of the extreme coordinates of the cluster.
390
391 In addition, clusters are reported in a bed file, where they can be filtered on several parameters:
392 cluster size, cluster read number or cluster read density (number of reads divided by the length of the cluster).
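
The clustering rule described in the help text above can be illustrated with a minimal Python sketch. It is not the code shipped in this changeset: the names `cluster_positions` and `summarize` are hypothetical, the distance is assumed to be measured between read start positions, and the cluster length is assumed to be inclusive of both extreme coordinates.

```python
from statistics import median


def cluster_positions(positions, max_distance):
    """Group read positions into clusters (hypothetical helper).

    A read joins the current cluster when it lies at most
    `max_distance` nucleotides after the previous read; otherwise
    a new cluster is started.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_distance:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)
    return clusters


def summarize(cluster):
    """Return the values the help text mentions for one cluster."""
    start, end = cluster[0], cluster[-1]
    length = end - start + 1  # assumption: inclusive length
    return {
        "start": start,
        "end": end,
        # plotted coordinate: median of the two extreme coordinates
        "center": median([start, end]),
        "reads": len(cluster),
        # density: number of reads divided by the cluster length
        "density": len(cluster) / length,
    }


reads = [100, 105, 112, 300, 310, 900]
clusters = cluster_positions(reads, max_distance=10)
summaries = [summarize(c) for c in clusters]
```

To mimic the polarity-aware mode, the same function could be called once on forward-read positions and once on reverse-read positions; the filters on size, read number or density would then be simple comparisons on the `summaries` entries.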
376 393
377 **Inputs** 394 **Inputs**
378 395
379 bam alignment files that must be 396 bam alignment files that must be
380 397
381 - single-read 398 - single-read
382 - sorted 399 - sorted
383 - mapped to the same reference 400 - mapped to the same reference
384 401
385 To plot 2 alignment files in the same PDF output the 'single dataset' method should be used.
386
387 .. class:: warningmark 402 .. class:: warningmark
388 403
389 If the 'multiple dataset' method is used the normalization factor will be applied to every file selected in the input list. 404 This tool follows a "map-reduce" procedure: multiple inputs, which can be arranged as a data collection,
390 Additionally each file in the selected list will be plotted in a separate PDF file. 405 are visualised side by side in a single pdf file.
406
407
391 408
392 **Output** 409 **Output**
393 410
394 A pdf file generated by the R package lattice and one or two dataframes used to plot the data. 411 A pdf file generated by the R package lattice and one or two dataframes used to plot the data.
395 412