annotate jaccardBed.xml @ 0:b8348686a0b9 draft

Imported from capsule None
author iuc
date Tue, 04 Nov 2014 01:45:04 -0500
parents
children 82aac94b06c3
<tool id="bedtools_jaccard" name="JaccardBed" version="@WRAPPER_VERSION@.0">
    <description></description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <expand macro="stdio" />
    <command>
        bedtools jaccard
            $reciprocal
            $strand
            $split
            -f $overlap
            -a $inputA
            -b $inputB
            &gt; $output
    </command>
    <inputs>
        <param format="bed,vcf,gff,gff3" name="inputA" type="data" label="BED/VCF/GFF file"/>
        <param format="bed,vcf,gff,gff3" name="inputB" type="data" label="BED/VCF/GFF file"/>
        <param name="overlap" type="float" value="0.000000001" label="Minimum overlap required as a fraction of A" />
        <param name="reciprocal" type="boolean" checked="false" truevalue="-r" falsevalue="" label="Require that the fraction of overlap be reciprocal for A and B" help="In other words, if -f is 0.90 and -r is used, this requires that B overlaps at least 90% of A and that A also overlaps at least 90% of B" />
        <param name="tab" type="boolean" checked="false" truevalue="-tab" falsevalue="" label="Report extract sequences in a tab-delimited format instead of in FASTA format." />
        <param name="strand" type="boolean" checked="false" truevalue="-s" falsevalue="" label="Force strandedness" help="That is, only report hits in B that overlap A on the same strand. By default, overlaps are reported without respect to strand" />
        <expand macro="strand2" />
    </inputs>
    <outputs>
        <data format="tabular" name="output" label="Jaccard statistic of ${inputA.name} and ${inputB.name}" />
    </outputs>
    <help>

**What it does**

By default, bedtools jaccard reports the length of the intersection, the length of the union (minus the intersection), the final Jaccard statistic reflecting the similarity of the two sets, as well as the number of intersections.
Whereas the bedtools intersect tool enumerates each and every intersection between two sets of genomic intervals, one often needs a single statistic reflecting the similarity of the two sets based on the intersections between them. The Jaccard statistic is used in set theory to represent the ratio of the intersection of two sets to the union of the two sets. Similarly, Favorov et al. [1] reported the use of the Jaccard statistic for genome intervals: specifically, it measures the ratio of the number of intersecting base pairs between two sets to the number of base pairs in the union of the two sets. The bedtools jaccard tool implements this statistic, yet modifies the statistic such that the length of the intersection is subtracted from the length of the union. As a result, the final statistic ranges from 0.0 to 1.0, where 0.0 represents no overlap and 1.0 represents complete overlap.

.. image:: $PATH_TO_IMAGES/jaccard-glyph.png

.. class:: warningmark

The jaccard tool requires that your data is pre-sorted by chromosome and then by start position (e.g., ``sort -k1,1 -k2,2n in.bed > in.sorted.bed`` for BED files).

@REFERENCES@
    </help>
    <expand macro="citations" />
</tool>
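
The help text above describes the statistic only in words, so here is a minimal Python sketch of the arithmetic. It is an illustration, not the bedtools implementation. It assumes that the "union" in the help means the summed length of both interval sets after merging overlaps within each set, so that subtracting the intersection gives the number of bases covered by either set. The function names `merge` and `jaccard` are hypothetical, and the intersection count here is the number of overlapping merged segments, which may differ from what bedtools reports.

```python
from collections import defaultdict


def merge(intervals):
    """Merge overlapping or touching (start, end) half-open intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def jaccard(a, b):
    """Return (intersection, union, statistic, n_intersections).

    a and b are iterables of (chrom, start, end) tuples using 0-based,
    half-open coordinates, as in BED. Input order does not matter here
    because merge() sorts; bedtools itself expects pre-sorted input.
    """
    by_chrom_a, by_chrom_b = defaultdict(list), defaultdict(list)
    for chrom, start, end in a:
        by_chrom_a[chrom].append((start, end))
    for chrom, start, end in b:
        by_chrom_b[chrom].append((start, end))

    intersection = total = n_intersections = 0
    for chrom in set(by_chrom_a) | set(by_chrom_b):
        ma, mb = merge(by_chrom_a[chrom]), merge(by_chrom_b[chrom])
        # Assumption: summed merged lengths of A and B on this chromosome.
        total += sum(e - s for s, e in ma) + sum(e - s for s, e in mb)
        # Sweep both sorted, merged lists once to find overlapping segments.
        i = j = 0
        while i < len(ma) and j < len(mb):
            lo = max(ma[i][0], mb[j][0])
            hi = min(ma[i][1], mb[j][1])
            if lo < hi:
                intersection += hi - lo
                n_intersections += 1
            # Advance whichever interval ends first.
            if ma[i][1] < mb[j][1]:
                i += 1
            else:
                j += 1

    union = total - intersection  # bases covered by A or B
    stat = intersection / union if union else 0.0
    return intersection, union, stat, n_intersections


a = [("chr1", 10, 20), ("chr1", 30, 40)]
b = [("chr1", 15, 25), ("chr2", 0, 10)]
print(jaccard(a, b))
```

With these two toy sets, the only shared bases are chr1:15-20, so the intersection is 5 bp against 35 bp covered by either set. Identical sets give 1.0 and disjoint sets give 0.0, matching the range stated in the help.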