annotate README @ 5:c2bb6aa52c74 draft

planemo upload for repository https://github.com/portiahollyoak/Tools commit 65ddf081d2f1a76bc4d6d91f01ab72667b9e1549-dirty
author portiahollyoak
date Mon, 23 May 2016 06:06:09 -0400
parents f65530d07350
children
rev   line source
0
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
BreakDancer-1.3.6, released under GPLv3, is a C++ package that provides genome-wide detection of structural variants from next-generation paired-end sequencing reads. It includes two complementary programs. BreakDancerMax predicts five types of structural variants (insertions, deletions, inversions, and inter- and intra-chromosomal translocations) from next-generation short paired-end sequencing reads, using read pairs that are mapped with unexpected separation distances or orientation. BreakDancerMini focuses on detecting small indels (usually between 10 bp and 100 bp) using normally mapped read pairs. Please read our paper for a detailed description of the algorithm: http://www.nature.com/nmeth/journal/v6/n9/abs/nmeth.1363.html

BreakDancerMax (the update from version 1.0 to 1.1 currently applies to the Cpp version only)
----------------------
Usage: breakdancer_max <analysis_config_file>
Options:
    -o STRING  operate on a single chromosome [all chromosome]
    -s INT     minimum length of a region [7]
    -c INT     cutoff in unit of standard deviation [3]
    -m INT     maximum SV size [1000000000]
    -q INT     minimum alternative mapping quality [35]
    -r INT     minimum number of read pairs required to establish a connection [2]
    -x INT     maximum threshold of haploid sequence coverage for regions to be ignored [1000]
    -b INT     buffer size for building connection [100]
    -t         only detect transchromosomal rearrangement, by default off
    -d STRING  prefix of fastq files that SV supporting reads will be saved by library
    -g STRING  dump SVs and supporting reads in BED format for GBrowse
    -l         analyze Illumina long insert (mate-pair) library
    -a         print out copy number by bam file rather than library, by default on
    -h         print out Allele Frequency column, by default off
    -y INT     output score filter [40]

Version 1.1 adds the following functions to version 1.0:

1. It computes the copy number, normalized over the whole genome or the whole chromosome, along with the SV detection. By default the copy number is computed per bam file, but it can also be computed per library with option "-a".
2. The Allele Frequency column is more accurate for the DEL type, with the help of the copy number computation. It is off by default. Include option "-h" if you want to look at the DEL type Allele Frequency.
3. Since there are numerous false positive SV calls, the output is filtered by a PhredQ score cutoff, which is 40 by default. Use option "-y yournumber" if you want to change the cutoff.

The following notes apply to options that already existed in version 1.0:

Most of these options are self-explanatory. The -o option is a convenient way to parallelize SV detection, with one process per chromosome. When -o is used, the detection of inter-chromosomal translocations is disabled. In that case, it may be convenient to run a separate process with -t to detect putative inter-chromosomal translocations without analyzing read pairs that are mapped to the same chromosome.

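The per-chromosome strategy described above can be sketched as a small Python helper that only builds the command lines. This is an illustration, not part of the BreakDancer package: the function name, the file name analysis.cfg, and the chromosome names are hypothetical, and placing the options before the configuration file is an assumption about the argument order.

```python
def build_commands(config_file, chromosomes, binary="breakdancer_max"):
    """Return one argument list per chromosome, plus one -t run.

    The -o and -t flags come from the option list in this README.
    Putting them before the configuration file is an assumption.
    """
    # One independent process per chromosome (translocations disabled by -o).
    commands = [[binary, "-o", chrom, config_file] for chrom in chromosomes]
    # A separate run that only looks for inter-chromosomal translocations.
    commands.append([binary, "-t", config_file])
    return commands


# Hypothetical inputs: the chromosome names must match the bam reference names.
COMMANDS = build_commands("analysis.cfg", ["1", "2", "X"])
```

Each argument list can then be handed to a job scheduler or to subprocess.run, with the standard output of every run redirected to its own file.
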
The beta-test x86_64 Cpp version of breakdancermax uses the samtools C library directly. It is fully compatible with the perl version, with identical usage and functions, but is over 10 times faster. However, it only supports properly formatted bam files and has only been tested using bam files produced by BWA. To obtain correct results, it is important to have the readgroup (@RG) tag in both the header and each alignment in the bam files. If you experience technical difficulty with the Cpp version, please email breakdancer-help@lists.sourceforge.net or consider using the perl version.

The input to BreakDancerMax-1.1 is a set of map files produced by a front-end aligner such as MAQ, BWA, NovoAlign, or Bfast, and a tab-delimited configuration file that specifies the locations of the map files, the detection parameters, and the sample information.

If your map files are in the sam/bam format, you can use bam2cfg.pl in the released package to automatically generate a configuration file (bam2cfg.pl also depends on AlnParser.pm in the release package). If you have a single bam file that contains multiple libraries, make sure that the readgroup and library information are properly encoded in the sam/bam header and in each alignment record; otherwise bam2cfg.pl may fail to produce a correct configuration file. Please follow the instructions on http://samtools.sourceforge.net to properly format your bam files.

An example of a manually written configuration file:

map:1.map mean:219 std:18 readlen:36.00 sample:tA exe:maq-0.6.8 mapview
map:2.map mean:220 std:19 readlen:36.00 sample:tB exe:maq-0.6.8 mapview
map:3.map mean:219 std:18 readlen:36.00 sample:nA exe:maq-0.7.1 mapview
map:4.map mean:219 std:18 readlen:36.00 sample:nB exe:maq-0.7.1 mapview

An example of a configuration file produced by bam2cfg.pl:

readgroup:2825107881 platform:illumina map:tumor.bam readlen:75.00 lib:H_KA-189941-0921313gsc-lib4 num:10001 lower:86.83 upper:443.91 mean:315.09 std:43.92 exe:samtools view
readgroup:2843249908 platform:illumina map:tumor.bam readlen:75.00 lib:H_KA-189941-0921313gsc-lib4 num:10001 lower:86.83 upper:443.91 mean:315.09 std:43.92 exe:samtools view
readgroup:2843255910 platform:illumina map:normal.bam readlen:75.00 lib:H_KA-189941-0904663-lib4 num:10001 lower:95.36 upper:443.31 mean:311.68 std:42.86 exe:samtools view
readgroup:2843255906 platform:illumina map:normal.bam readlen:75.00 lib:H_KA-189941-0904663-lib4 num:10001 lower:95.36 upper:443.31 mean:311.68 std:42.86 exe:samtools view

Each row must contain at least 6 key:value pairs (key and value separated by a colon) that specify:

1). the location of the map file
2). the mean insert size
3). the standard deviation of the insert size
4). the average read length
5). a unique identifier assigned to the map file (usually representing a PE library)
6). a command line that can be run by perl system calls to produce MAQ mapview alignments

In addition to the above 6 keys (map, mean, std, readlen, sample, and exe), BreakDancerMax allows users to explicitly specify the separation thresholds using the keys upper and lower. For example:
map:1.map upper:300 lower:100 readlen:36.00 sample:tA exe:maq-0.6.8 mapview -b

This instructs BreakDancerMax to detect deletions using read pairs that are at least 300 bp apart (outer distance) and to detect insertions using read pairs that are at most 100 bp apart.

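To make the row layout concrete, here is a minimal Python sketch that reads one configuration row into a dictionary. It is not code from the BreakDancer package. The function name is hypothetical, and the sketch assumes that fields are separated by tab characters (as the README states) and that only the first colon in a field separates the key from the value, so a value such as "maq-0.6.8 mapview -b" keeps its spaces.

```python
def parse_config_line(line):
    """Split one tab-delimited configuration row into a {key: value} dict.

    All values are kept as strings; callers convert what they need.
    """
    fields = {}
    for item in line.rstrip("\n").split("\t"):
        if not item:
            continue  # tolerate doubled or trailing tabs
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"not a key:value pair: {item!r}")
        fields[key] = value
    return fields


# The explicit-threshold example row from above, written with real tabs.
EXAMPLE_ROW = "\t".join([
    "map:1.map", "upper:300", "lower:100",
    "readlen:36.00", "sample:tA", "exe:maq-0.6.8 mapview -b",
])
EXAMPLE_FIELDS = parse_config_line(EXAMPLE_ROW)
```

Keeping the values as strings avoids guessing which keys are numeric; a caller interested in the thresholds can convert "upper" and "lower" with float().
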
The upper and lower key:value pairs, when explicitly specified, take precedence over the upper and lower thresholds computed from the mean, the std, and the user-specified threshold in units of standard deviation:
upper: mean + std * threshold specified by user option -c
lower: mean - std * threshold specified by user option -c

The -c option defaults to 3. Therefore, the upper and lower separation thresholds would be mean + 3 std and mean - 3 std, respectively. It is useful to explicitly specify the upper and lower separation thresholds when the insert size distribution is not symmetric about the mean.

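The threshold rule above can be written out as a short Python sketch. It is an illustration of the formulas in this README, not BreakDancer's own code; the function name and the dictionary-of-strings input are assumptions made for the example.

```python
def separation_thresholds(row, cutoff=3.0):
    """Return (lower, upper) separation thresholds for one config row.

    `row` maps config keys to string values. Explicit "lower" and "upper"
    keys take precedence; otherwise the bound is mean -/+ cutoff * std,
    where `cutoff` plays the role of the -c option (default 3).
    """
    def bound(key, sign):
        if key in row:
            return float(row[key])
        return float(row["mean"]) + sign * cutoff * float(row["std"])

    return bound("lower", -1), bound("upper", +1)


# With mean 219 and std 18, the default cutoff gives 219 - 54 and 219 + 54.
DEFAULT_BOUNDS = separation_thresholds({"mean": "219", "std": "18"})
# Explicit keys override the computed values.
EXPLICIT_BOUNDS = separation_thresholds({"upper": "300", "lower": "100"})
```

For the first row of the manual configuration example (mean 219, std 18), this gives a lower threshold of 165 bp and an upper threshold of 273 bp.
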
The -o option enables per-chromosome/reference analysis and is much faster when the input files are in the bam format. Please index the bam files using "samtools index" to use this option. You need to specify the reference names exactly as they appear in the bam files.

When -e is on, BreakDancerMax tries to estimate the mean and the standard deviation of the insert size from the data instead of relying on the user's specification in the configuration file. The current implementation of this estimation process is slow, so it is recommended that users specify accurate thresholds in the configuration file.

The -l option tells BreakDancerMax that the data were produced from an Illumina long insert circularized library.

The -f option uses Fisher's method to summarize scores from multiple libraries. It is recommended when there are many libraries. It ensures that the scores are independent of the number of libraries (uniform distribution of the P values).

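For readers unfamiliar with Fisher's method, the following Python sketch shows the textbook form of the calculation: the statistic X = -2 * sum(ln p_i) follows a chi-square distribution with 2k degrees of freedom when the k P values are independent and uniform. This is a generic illustration and is not taken from the BreakDancer source, so how -f applies the method internally is not shown here; the function name is hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent P values with Fisher's method.

    Uses the closed form of the chi-square survival function for an even
    number of degrees of freedom (2k):
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not pvalues:
        raise ValueError("at least one P value is required")
    if any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("P values must lie in (0, 1]")

    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2

    term = 1.0   # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

A single P value is returned unchanged, and two P values of 0.05 combine to roughly 0.0175, which shows how the combined value accounts for the number of inputs instead of simply multiplying them.
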
The -q option specifies the MAQ mapping quality threshold and can be used to skip reads that are not confidently mapped.

The -s option specifies the minimum required size of an SV anchoring region, in which the anomalously mapped reads are found. This parameter has a small effect on the SV detection accuracy: increasing -s improves the specificity but reduces the sensitivity. The default of 7 bp seems to work well.

The -b option specifies the number of anomalous regions that reside in RAM before SV hypotheses begin to form among these regions. The default works well in general. For an exceptionally large dataset, it may be helpful to reduce it to cut the resident RAM usage.

The -d option specifies a fastq file in which all SV supporting reads will be saved in the fastq format. These reads can be realigned by other aligners such as novoalign and then reanalyzed by BreakDancer.

Listing multiple map files in a single configuration file automatically enables pooled analysis: reads from all the map files are jointly analyzed to find unified SV hypotheses across all the map files.


The output format
----------------------
BreakDancer's output file consists of the following columns:

1. Chromosome 1
2. Position 1
3. Orientation 1
4. Chromosome 2
5. Position 2
6. Orientation 2
7. Type of a SV
8. Size of a SV
9. Confidence Score
10. Total number of supporting read pairs
11. Total number of supporting read pairs from each map file
12. Estimated allele frequency
13. Software version
14. The run parameters

111 Columns 1-3 and 4-6 are used to specify the coordinates of the two SV breakpoints. The orientation is a string that records the number of reads mapped to the plus (+) or the minus (-) strand in the anchoring regions.
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
112
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
113 Column 7 is the type of SV detected: DEL (deletions), INS (insertion), INV (inversion), ITX (intra-chromosomal translocation), CTX (inter-chromosomal translocation), and Unknown.
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
114 Column 8 is the size of the SV in bp. It is meaningless for inter-chromosomal translocations.
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
115 Column 9 is the confidence score associated with the prediction.
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
116 Column 11 can be used to dissect the origin of the supporting read pairs, which is useful in pooled analysis. For example, one may want to give SVs that are supported by more than one libraries higher confidence than those detected in only one library. It can also be used to distinguish somatic events from the germline, i.e., those detected in only the tumor libraries versus those detected in both the tumor and the normal libraries.
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
117 Column 12 is currently a placeholder for displaying estimated allele frequency. The allele frequencies estimated in this version are not accurate and should not be trusted.
f65530d07350 planemo upload for repository https://github.com/portiahollyoak/Tools commit 1f1c277219ca756c9baa453592b455597fd593d8-dirty
portiahollyoak
parents:
diff changeset
118 Column 13 and 14 are information useful to reproduce the results.

Example 1:
1 10000 10+0- 2 20000 7+10- CTX -296 99 10 tB|10 1.00 BreakDancerMax-0.0.1 t1

An inter-chromosomal translocation that starts from chr1:10000 and goes into chr2:20000, with 10 supporting read pairs from library tB and a confidence score of 99.

Example 2:
1 59257 5+1- 1 60164 0+5- DEL 862 99 5 nA|2:tB|1 0.56 BreakDancerMax-0.0.1 c4

A deletion between chr1:59257 and chr1:60164 connected by 5 read pairs, among which 2 in library nA and 1 in library tB support the deletion hypothesis. This deletion was detected by BreakDancerMax-0.0.1 with a separation threshold of 4 s.d.
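
The per-library breakdown in column 11 (for instance nA|2:tB|1 in Example 2) can be used for the pooled analysis described earlier. The sketch below is an illustration under stated assumptions, not BreakDancer code: parse_library_counts and classify_origin are hypothetical helpers, entries are assumed to be "library|count" pairs joined by ":", and which libraries are tumor or normal must be supplied by the user (treating tB as tumor and nA as normal here is only an assumption for the example).

```python
from typing import Dict, Iterable


def parse_library_counts(column11: str) -> Dict[str, int]:
    """Turn a string such as "nA|2:tB|1" into {"nA": 2, "tB": 1}."""
    counts: Dict[str, int] = {}
    for entry in column11.split(":"):
        name, sep, count = entry.rpartition("|")
        if not sep or not name:
            raise ValueError(f"malformed library entry: {entry!r}")
        counts[name] = int(count)
    return counts


def classify_origin(counts: Dict[str, int],
                    tumor_libs: Iterable[str],
                    normal_libs: Iterable[str]) -> str:
    """Label a call by which group of libraries supports it."""
    in_tumor = any(counts.get(lib, 0) > 0 for lib in tumor_libs)
    in_normal = any(counts.get(lib, 0) > 0 for lib in normal_libs)
    if in_tumor and in_normal:
        return "germline candidate"   # seen in tumor and normal
    if in_tumor:
        return "somatic candidate"    # seen in tumor only
    if in_normal:
        return "normal only"
    return "unassigned"


# Assumed library roles, for illustration only.
tumor, normal = {"tB"}, {"nA"}

example2 = parse_library_counts("nA|2:tB|1")
print(example2, classify_origin(example2, tumor, normal))

example1 = parse_library_counts("tB|10")
print(len(example1) > 1, classify_origin(example1, tumor, normal))
```

The check len(counts) > 1 is one simple way to flag calls supported by more than one library; any threshold on the per-library counts is a choice left to the analyst.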

Example 3:
1 62767 10+0- 1 63126 0+10- INS -13 36 10 NA|10 1.00 BreakDancerMini-0.0.1 q10

A 13 bp insertion detected by BreakDancerMini between chr1:62767 and chr1:63126 with 10 supporting read pairs from a single library 'NA' and a confidence score of 36.

Notes:
Real SV breakpoints are expected to reside within the predicted boundaries with a margin greater than the read length.
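
One possible reading of this note is that a predicted coordinate should be treated as a window rather than an exact position when it is compared with other data. The sketch below follows that reading only; breakpoint_window is a hypothetical helper, coordinates are assumed to be 1-based, and the margin (read length plus an optional extra) is a choice of this example rather than something BreakDancer prescribes.

```python
from typing import Tuple


def breakpoint_window(predicted_pos: int, read_length: int,
                      extra: int = 0) -> Tuple[int, int]:
    """Return (start, end) of a search window around a predicted boundary.

    The window extends read_length + extra bases on each side and is
    clamped so that it never starts before position 1.
    """
    if read_length <= 0 or extra < 0:
        raise ValueError("read_length must be positive and extra non-negative")
    margin = read_length + extra
    start = max(1, predicted_pos - margin)
    end = predicted_pos + margin
    return start, end


# Boundaries of the deletion in Example 2, with an assumed 100 bp read length.
print(breakpoint_window(59257, 100))
print(breakpoint_window(60164, 100))
```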

The BreakDancerMini code will not be included in upcoming releases. We recommend using Pindel to detect intermediate-size indels (10-80 bp).

Dependencies on Other Perl Modules
------------------------------------
use Statistics::Descriptive;
use Math::CDF;

These are available from CPAN:
http://search.cpan.org/~colink/Statistics-Descriptive-2.6/Descriptive.pm
http://search.cpan.org/~callahan/Math-CDF-0.1/CDF.pm

use Poisson;
This module is provided. Please make sure the "use lib" line at the beginning of the Perl scripts contains the correct path.


Acknowledgements
-----------------
Heng Li at the Wellcome Trust Sanger Institute contributed an early version of this code.
Many colleagues at The Genome Center of Washington University have supported this effort.


Ken Chen and Xian Fan
Washington University Genome Center

July 14, 2009