annotate PsiCLASS-1.0.2/samtools-0.1.19/samtools.1 @ 0:903fc43d6227 draft default tip

Uploaded
author lsong10
date Fri, 26 Mar 2021 16:52:45 +0000
.TH samtools 1 "15 March 2013" "samtools-0.1.19" "Bioinformatics tools"
.SH NAME
.PP
samtools - Utilities for the Sequence Alignment/Map (SAM) format

bcftools - Utilities for the Binary Call Format (BCF) and VCF
.SH SYNOPSIS
.PP
samtools view -bt ref_list.txt -o aln.bam aln.sam.gz
.PP
samtools sort aln.bam aln.sorted
.PP
samtools index aln.sorted.bam
.PP
samtools idxstats aln.sorted.bam
.PP
samtools view aln.sorted.bam chr2:20,100,000-20,200,000
.PP
samtools merge out.bam in1.bam in2.bam in3.bam
.PP
samtools faidx ref.fasta
.PP
samtools pileup -vcf ref.fasta aln.sorted.bam
.PP
samtools mpileup -C50 -gf ref.fasta -r chr3:1,000-2,000 in1.bam in2.bam
.PP
samtools tview aln.sorted.bam ref.fasta
.PP
bcftools index in.bcf
.PP
bcftools view in.bcf chr2:100-200 > out.vcf
.PP
bcftools view -Nvm0.99 in.bcf > out.vcf 2> out.afs

.SH DESCRIPTION
.PP
Samtools is a set of utilities that manipulate alignments in the BAM
format. It imports from and exports to the SAM (Sequence Alignment/Map)
format, does sorting, merging and indexing, and allows reads in any
region to be retrieved swiftly.

Samtools is designed to work on a stream. It regards an input file `-'
as the standard input (stdin) and an output file `-' as the standard
output (stdout). Several commands can thus be combined with Unix
pipes. Samtools always outputs warning and error messages to the standard
error output (stderr).

Samtools is also able to open a BAM (not SAM) file on a remote FTP or
HTTP server if the BAM file name starts with `ftp://' or `http://'.
Samtools checks the current working directory for the index file and
will download the index if it is absent. Samtools does not retrieve the
entire alignment file unless it is asked to do so.

.SH SAMTOOLS COMMANDS AND OPTIONS
.TP 10
.B view
samtools view [-bchuHS] [-t in.refList] [-o output] [-f reqFlag] [-F
skipFlag] [-q minMapQ] [-l library] [-r readGroup] [-R rgFile] <in.bam>|<in.sam> [region1 [...]]

Extract/print all or a subset of alignments in SAM or BAM format. If no region
is specified, all the alignments will be printed; otherwise only
alignments overlapping the specified regions will be output. An
alignment may be output multiple times if it overlaps several
regions. A region can be given, for example, in the following
format: `chr2' (the whole chr2), `chr2:1000000' (region starting from
1,000,000bp) or `chr2:1,000,000-2,000,000' (region between 1,000,000 and
2,000,000bp including the end points). Coordinates are 1-based.
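
The region syntax above can be illustrated with a small Python sketch. This is not samtools code: the names parse_region and overlaps are hypothetical, and the sketch assumes only the three forms listed above, with commas as optional digit separators and 1-based inclusive coordinates.

```python
import re

# Hypothetical helper (not part of samtools): matches `name',
# `name:start' and `name:start-end', where numbers may contain commas.
_REGION = re.compile(
    r"^(?P<name>[^:]+)(?::(?P<start>[0-9,]+)(?:-(?P<end>[0-9,]+))?)?$"
)


def parse_region(region):
    """Return (name, start, end); end is None when the region is open-ended."""
    m = _REGION.match(region)
    if m is None:
        raise ValueError("unrecognised region: %r" % region)

    def to_int(text):
        return int(text.replace(",", ""))

    # A missing start means the whole reference, i.e. from position 1.
    start = to_int(m.group("start")) if m.group("start") else 1
    end = to_int(m.group("end")) if m.group("end") else None
    if end is not None and end < start:
        raise ValueError("region end precedes start: %r" % region)
    return m.group("name"), start, end


def overlaps(aln_start, aln_end, start, end):
    """True if the 1-based inclusive span [aln_start, aln_end] overlaps the region."""
    return aln_end >= start and (end is None or aln_start <= end)
```

For example, parse_region("chr2:1,000,000-2,000,000") yields the name chr2 with both end points included, so an alignment ending exactly at position 1,000,000 still overlaps the region.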

.B OPTIONS:
.RS
.TP 10
.B -b
Output in the BAM format.
.TP
.BI -f \ INT
Only output alignments with all bits in INT present in the FLAG
field. INT can be in hex in the format of /^0x[0-9A-F]+/ [0]
.TP
.BI -F \ INT
Skip alignments with any of the bits in INT present in the FLAG field [0]
.TP
.B -h
Include the header in the output.
.TP
.B -H
Output the header only.
.TP
.BI -l \ STR
Only output reads in library STR [null]
.TP
.BI -o \ FILE
Output file [stdout]
.TP
.BI -q \ INT
Skip alignments with MAPQ smaller than INT [0]
.TP
.BI -r \ STR
Only output reads in read group STR [null]
.TP
.BI -R \ FILE
Output reads in read groups listed in
.I FILE
[null]
.TP
.BI -s \ FLOAT
Fraction of templates/pairs to subsample; the integer part is treated as the
seed for the random number generator [-1]
.TP
.B -S
Input is in SAM. If @SQ header lines are absent, the
.B `-t'
option is required.
.TP
.B -c
Instead of printing the alignments, only count them and print the
total number. All filter options, such as
.B `-f',
.B `-F'
and
.BR `-q' ,
are taken into account.
.TP
.BI -t \ FILE
This file is TAB-delimited. Each line must contain the reference name
and the length of the reference, one line for each distinct reference;
additional fields are ignored. This file also defines the order of the
reference sequences in sorting. If you run `samtools faidx <ref.fa>',
the resultant index file
.I <ref.fa>.fai
can be used as this
.I <in.ref_list>
file.
.TP
.B -u
Output uncompressed BAM. This option saves time spent on
compression/decompression and is thus preferred when the output is piped
to another samtools command.
.RE
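
The combined effect of the -f, -F and -q filters described above can be sketched in Python as follows. The function names are hypothetical and the sketch covers only these three options; it is an illustration of the documented rules, not the samtools implementation.

```python
def parse_flag_int(text):
    """Parse an INT given in decimal or as 0x-prefixed hex."""
    return int(text, 16) if text.lower().startswith("0x") else int(text)


def passes_filters(flag, mapq, require=0, exclude=0, min_mapq=0):
    """Return True if an alignment would be kept under -f/-F/-q."""
    if (flag & require) != require:
        return False  # -f: every bit in INT must be present in FLAG
    if flag & exclude:
        return False  # -F: any bit in INT present in FLAG means skip
    if mapq < min_mapq:
        return False  # -q: skip alignments with MAPQ smaller than INT
    return True


def count_alignments(alignments, **filters):
    """Count (flag, mapq) pairs that pass the filters, in the spirit of -c."""
    return sum(1 for flag, mapq in alignments if passes_filters(flag, mapq, **filters))
```

With these helpers, count_alignments(pairs, exclude=parse_flag_int("0x4"), min_mapq=20) counts the pairs whose 0x4 bit is unset and whose MAPQ is at least 20.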
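
The way a single -s FLOAT value carries both a seed and a fraction can be sketched as below. The helper names are hypothetical, the sketch handles only non-negative values such as 42.1, and the use of zlib.crc32 as the hash is an assumption made for illustration; the hash actually used by samtools is not described in this manual.

```python
import zlib


def split_subsample_arg(text):
    """Split an -s value such as '42.1' into (seed, fraction) = (42, 0.1)."""
    int_part, _, frac_part = text.partition(".")
    seed = int(int_part) if int_part else 0
    fraction = float("0." + frac_part) if frac_part else 0.0
    return seed, fraction


def keep_template(read_name, seed, fraction):
    """Decide from the read name alone, so both mates of a pair agree."""
    # Assumption: CRC32 of "seed:name" stands in for the real hash function.
    digest = zlib.crc32(("%d:%s" % (seed, read_name)).encode("utf-8"))
    return (digest & 0xFFFFFF) / float(0x1000000) < fraction
```

Because the decision depends only on the read name and the seed, rerunning with the same -s value selects the same templates, and changing the integer part selects a different subset of about the same size.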
.TP
.B tview
samtools tview
.RB [ \-p
.IR chr:pos ]
.RB [ \-s
.IR STR ]
.RB [ \-d
.IR display ]
.RI <in.sorted.bam>
.RI [ref.fasta]

Text alignment viewer (based on the ncurses library). In the viewer,
press `?' for help and press `g' to jump to the alignments starting at a
region given in a format like `chr10:10,000,000', or `=10,000,000' when
viewing the same reference sequence.

.B Options:
.RS
.TP 14
.BI -d \ display
Output as (H)tml or (C)urses or (T)ext
.TP
.BI -p \ chr:pos
Go directly to this position
.TP
.BI -s \ STR
Display only reads from this sample or read group
.RE

.TP
.B mpileup
samtools mpileup
.RB [ \-EBugp ]
.RB [ \-C
.IR capQcoef ]
.RB [ \-r
.IR reg ]
.RB [ \-f
.IR in.fa ]
.RB [ \-l
.IR list ]
.RB [ \-M
.IR capMapQ ]
.RB [ \-Q
.IR minBaseQ ]
.RB [ \-q
.IR minMapQ ]
.I in.bam
.RI [ in2.bam
.RI [ ... ]]

Generate BCF or pileup for one or multiple BAM files. Alignment records
are grouped by sample identifiers in @RG header lines. If sample
identifiers are absent, each input file is regarded as one sample.

In the pileup format (without
.B -u
or
.BR -g ),
each line represents a genomic position, consisting of chromosome name,
coordinate, reference base, read bases, read qualities and alignment
mapping qualities. Information on match, mismatch, indel, strand,
mapping quality and start and end of a read are all encoded at the read
base column. At this column, a dot stands for a match to the reference
base on the forward strand, a comma for a match on the reverse strand,
a '>' or '<' for a reference skip, `ACGTN' for a mismatch on the forward
strand and `acgtn' for a mismatch on the reverse strand. A pattern
`\\+[0-9]+[ACGTNacgtn]+' indicates there is an insertion between this
reference position and the next reference position. The length of the
insertion is given by the integer in the pattern, followed by the
inserted sequence. Similarly, a pattern `-[0-9]+[ACGTNacgtn]+'
represents a deletion from the reference. The deleted bases will be
presented as `*' in the following lines. Also at the read base column, a
symbol `^' marks the start of a read. The ASCII code of the character
following `^' minus 33 gives the mapping quality. A symbol `$' marks the
end of a read segment.
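
The encoding of the read base column can be made concrete with a short Python sketch that splits the column into events. The function name and the event labels are hypothetical, and the sketch covers only the symbols listed above.

```python
# Single-character symbols of the read base column and the event each one denotes.
_SINGLE = {
    ".": ("match", "+"),    # match on the forward strand
    ",": ("match", "-"),    # match on the reverse strand
    "$": ("end", None),     # end of a read segment
    "*": ("deleted", None), # base deleted by an earlier deletion
    ">": ("refskip", ">"),
    "<": ("refskip", "<"),
}


def parse_read_bases(column):
    """Split one pileup read-base column into (event, value) tuples."""
    events = []
    i = 0
    while i < len(column):
        c = column[i]
        if c == "^":
            # '^' starts a read; the next character encodes MAPQ + 33.
            if i + 1 >= len(column):
                raise ValueError("'^' without a mapping-quality character")
            events.append(("start", ord(column[i + 1]) - 33))
            i += 2
        elif c in "+-":
            # '+' or '-' is followed by a length and that many bases.
            j = i + 1
            while j < len(column) and column[j].isdigit():
                j += 1
            if j == i + 1:
                raise ValueError("indel without a length at offset %d" % i)
            length = int(column[i + 1:j])
            kind = "insertion" if c == "+" else "deletion"
            events.append((kind, column[j:j + length]))
            i = j + length
        elif c in _SINGLE:
            events.append(_SINGLE[c])
            i += 1
        elif c in "ACGTNacgtn":
            # Upper case: forward-strand mismatch; lower case: reverse strand.
            events.append(("mismatch", c))
            i += 1
        else:
            raise ValueError("unexpected symbol %r at offset %d" % (c, i))
    return events
```

The character after `^' is consumed as a quality value before any other rule is tried, which matters because that character may itself be a symbol such as `+' or `$'.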

.B Input Options:
.RS
.TP 10
.B -6
Assume the quality is in the Illumina 1.3+ encoding.
.TP
.B -A
Do not skip anomalous read pairs in variant calling.
.TP
.B -B
Disable probabilistic realignment for the computation of base alignment
quality (BAQ). BAQ is the Phred-scaled probability of a read base being
misaligned. BAQ greatly helps to reduce false SNPs
caused by misalignments.
.TP
.BI -b \ FILE
List of input BAM files, one file per line [null]
.TP
.BI -C \ INT
Coefficient for downgrading mapping quality for reads containing
excessive mismatches. Given a read with a Phred-scaled probability q of
being generated from the mapped position, the new mapping quality is
about sqrt((INT-q)/INT)*INT. A zero value disables this
functionality; if enabled, the recommended value for BWA is 50. [0]
.TP
.BI -d \ INT
At a position, read at most
.I INT
reads per input BAM. [250]
.TP
.B -E
Extended BAQ computation. This option helps sensitivity especially for MNPs, but may hurt
specificity slightly.
.TP
.BI -f \ FILE
The
.BR faidx -indexed
reference file in the FASTA format. The file can be optionally compressed by
.BR razip .
[null]
.TP
.BI -l \ FILE
BED or position list file containing a list of regions or sites where pileup or BCF should be generated [null]
.TP
.BI -q \ INT
Minimum mapping quality for an alignment to be used [0]
.TP
.BI -Q \ INT
Minimum base quality for a base to be considered [13]
.TP
.BI -r \ STR
Only generate pileup in region
.I STR
[all sites]
.TP
.B Output Options:

.TP
.B -D
Output per-sample read depth
.TP
.B -g
Compute genotype likelihoods and output them in the binary call format (BCF).
.TP
.B -S
Output per-sample Phred-scaled strand bias P-value
.TP
.B -u
Similar to
.B -g
except that the output is uncompressed BCF, which is preferred for piping.

.TP
.B Options for Genotype Likelihood Computation (for -g or -u):

.TP
.BI -e \ INT
Phred-scaled gap extension sequencing error probability. Reducing
.I INT
leads to longer indels. [20]
.TP
.BI -h \ INT
Coefficient for modeling homopolymer errors. Given an
.IR l -long
homopolymer
run, the sequencing error of an indel of size
.I s
is modeled as
.IR INT * s / l .
[100]
.TP
.B -I
Do not perform INDEL calling
.TP
.BI -L \ INT
Skip INDEL calling if the average per-sample depth is above
.IR INT .
[250]
.TP
.B -p
Apply -m and -F thresholds per sample to increase sensitivity of calling.
By default both options are applied to reads pooled from all samples.
.TP
.BI -P \ STR
Comma-delimited list of platforms (determined by
.BR @RG-PL )
from which indel candidates are obtained. It is recommended to collect
indel candidates from sequencing technologies that have low indel error
rate such as ILLUMINA. [all]
.RE
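
The two formulas quoted for -C and -h can be evaluated directly. The Python sketch below does only that: the function names are hypothetical, clamping q into the range 0..INT is an assumption, and rounding or any further adjustment that samtools may apply is left out.

```python
import math


def downgraded_mapq(q, coef):
    """Evaluate sqrt((INT-q)/INT)*INT from the -C description; None if disabled."""
    if coef == 0:
        return None  # a zero coefficient disables the adjustment
    q = min(max(float(q), 0.0), float(coef))  # assumption: keep q within [0, INT]
    return math.sqrt((coef - q) / coef) * coef


def homopolymer_indel_error(coef, indel_size, run_length):
    """Evaluate INT*s/l from the -h description for an l-long homopolymer run."""
    return coef * indel_size / float(run_length)
```

With the value of 50 recommended for BWA, q = 0 leaves the result at 50 and q = 32 gives about 30, so the result shrinks as q grows towards INT.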

.TP
.B reheader
samtools reheader <in.header.sam> <in.bam>

Replace the header in
.I in.bam
with the header in
.IR in.header.sam .
This command is much faster than replacing the header with a
BAM->SAM->BAM conversion.

.TP
.B cat
samtools cat [-h header.sam] [-o out.bam] <in1.bam> <in2.bam> [ ... ]

Concatenate BAMs. The sequence dictionary of each input BAM must be identical,
although this command does not check this. This command uses a similar trick
to
.B reheader
which enables fast BAM concatenation.

.TP
.B sort
samtools sort [-nof] [-m maxMem] <in.bam> <out.prefix>

Sort alignments by leftmost coordinates. File
.I <out.prefix>.bam
will be created. This command may also create temporary files
.I <out.prefix>.%d.bam
when the whole alignment does not fit into memory (controlled by
option -m).

.B OPTIONS:
.RS
.TP 8
.B -o
Output the final alignment to the standard output.
.TP
.B -n
Sort by read names rather than by chromosomal coordinates
.TP
.B -f
Use
.I <out.prefix>
as the full output path and do not append
.I .bam
suffix.
.TP
.BI -m \ INT
Approximately the maximum required memory. [500000000]
.RE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
384
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
385 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
386 .B merge
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
387 samtools merge [-nur1f] [-h inh.sam] [-R reg] <out.bam> <in1.bam> <in2.bam> [...]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
388
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
389 Merge multiple sorted alignments.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
390 The header reference lists of all the input BAM files, and the @SQ headers of
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
391 .IR inh.sam ,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
392 if any, must all refer to the same set of reference sequences.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
393 The header reference list and (unless overridden by
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
394 .BR -h )
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
395 `@' headers of
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
396 .I in1.bam
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
397 will be copied to
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
398 .IR out.bam ,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
399 and the headers of other files will be ignored.
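
Merging inputs that are already sorted amounts to a k-way merge. The following Python sketch shows the idea under stated assumptions: records are hypothetical (ref_index, pos, name) tuples, merge_sorted is an illustrative name, and the name-sorted mode uses plain string comparison, which may differ from the comparison samtools itself applies.

```python
import heapq


def merge_sorted(inputs, by_name=False):
    """Merge already-sorted lists of (ref_index, pos, name) records.

    by_name=True mirrors the idea behind -n, using plain string comparison
    of read names (an assumption; the real comparison may differ).
    """
    if by_name:
        key = lambda rec: rec[2]
    else:
        key = lambda rec: (rec[0], rec[1])
    # heapq.merge only looks at the head of each input, so it never needs
    # to hold all records in memory at once.
    return list(heapq.merge(*inputs, key=key))


# Made-up records for two coordinate-sorted inputs.
in1 = [(0, 100, "a"), (0, 300, "c"), (1, 50, "e")]
in2 = [(0, 200, "b"), (1, 10, "d")]
merged = merge_sorted([in1, in2])
```

The merge is only correct if every input uses the same reference ordering, which is why the input headers must refer to the same set of reference sequences.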
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
400
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
401 .B OPTIONS:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
402 .RS
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
403 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
404 .B -1
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
405 Use zlib compression level 1 to compress the output
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
406 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
407 .B -f
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
408 Overwrite the output file if it already exists.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
409 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
410 .BI -h \ FILE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
411 Use the lines of
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
412 .I FILE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
413 as `@' headers to be copied to
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
414 .IR out.bam ,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
415 replacing any header lines that would otherwise be copied from
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
416 .IR in1.bam .
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
417 .RI ( FILE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
418 is actually in SAM format, though any alignment records it may contain
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
419 are ignored.)
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
420 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
421 .B -n
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
422 The input alignments are sorted by read names rather than by chromosomal
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
423 coordinates
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
424 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
425 .BI -R \ STR
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
426 Merge files in the specified region indicated by
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
427 .I STR
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
428 [null]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
429 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
430 .B -r
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
431 Attach an RG tag to each alignment. The tag value is inferred from file names.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
432 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
433 .B -u
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
434 Uncompressed BAM output
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
435 .RE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
436
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
437 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
438 .B index
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
439 samtools index <aln.bam>
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
440
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
441 Index sorted alignment for fast random access. Index file
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
442 .I <aln.bam>.bai
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
443 will be created.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
444
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
445 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
446 .B idxstats
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
447 samtools idxstats <aln.bam>
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
448
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
449 Retrieve and print stats in the index file. The output is TAB delimited
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
450 with each line consisting of reference sequence name, sequence length, #
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
451 mapped reads and # unmapped reads.
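
Because the idxstats output is TAB-delimited with four columns, it is easy to post-process. The Python sketch below parses such text; the function name parse_idxstats and the sample values are made up for illustration.

```python
def parse_idxstats(text):
    """Parse idxstats-style text into a list of dicts, one per reference."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Columns: reference name, sequence length, mapped, unmapped.
        name, length, mapped, unmapped = line.split("\t")
        rows.append({
            "name": name,
            "length": int(length),
            "mapped": int(mapped),
            "unmapped": int(unmapped),
        })
    return rows


# Made-up example values, not real output.
example = "chr1\t1000\t25\t1\nchr2\t500\t10\t0\n"
rows = parse_idxstats(example)
total_mapped = sum(r["mapped"] for r in rows)
```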
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
452
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
453 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
454 .B faidx
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
455 samtools faidx <ref.fasta> [region1 [...]]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
456
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
457 Index reference sequence in the FASTA format or extract subsequence from
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
458 indexed reference sequence. If no region is specified,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
459 .B faidx
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
460 will index the file and create
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
461 .I <ref.fasta>.fai
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
462 on the disk. If regions are specified, the subsequences will be
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
463 retrieved and printed to stdout in the FASTA format. The input file can
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
464 be compressed in the
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
465 .B RAZF
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
466 format.
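
A region string such as chr2:20,100,000-20,200,000 (the form used in the SYNOPSIS) can be split into a name and a coordinate range. The Python sketch below is a minimal illustration that assumes 1-based inclusive coordinates and optional thousands separators; parse_region and extract are hypothetical helper names, and this is not how faidx itself is implemented.

```python
def parse_region(region):
    """Split 'name[:start[-end]]' into (name, start, end).

    Missing coordinates are returned as None. Commas in the numbers
    are treated as thousands separators and removed.
    """
    name, sep, span = region.rpartition(":")
    if not sep:
        return region, None, None
    span = span.replace(",", "")
    start_text, dash, end_text = span.partition("-")
    start = int(start_text)
    end = int(end_text) if dash else None
    return name, start, end


def extract(sequence, start, end):
    """Return the subsequence for 1-based inclusive coordinates (assumed)."""
    begin = 0 if start is None else start - 1
    return sequence[begin:end]
```

For example, parse_region("chr2:500") yields the name, a start of 500 and no end, in which case extract returns everything from that position onwards.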
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
467
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
468 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
469 .B fixmate
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
470 samtools fixmate <in.nameSrt.bam> <out.bam>
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
471
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
472 Fill in mate coordinates, ISIZE and mate related flags from a
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
473 name-sorted alignment.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
474
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
475 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
476 .B rmdup
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
477 samtools rmdup [-sS] <input.srt.bam> <out.bam>
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
478
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
479 Remove potential PCR duplicates: if multiple read pairs have identical
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
480 external coordinates, only retain the pair with highest mapping quality.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
481 In the paired-end mode, this command
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
482 .B ONLY
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
483 works with FR orientation and requires that ISIZE be correctly set. It does
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
484 not work for unpaired reads (e.g. two ends mapped to different
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
485 chromosomes or orphan reads).
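
The rule stated above (among pairs with identical external coordinates, keep the one with the highest mapping quality) can be sketched in a few lines of Python. This is an illustration only: the dictionary keys, the tie-breaking rule (first seen wins) and the function name remove_duplicates are assumptions, and orientation and ISIZE handling are not modelled.

```python
def remove_duplicates(pairs):
    """Keep one pair per (chrom, start, end), preferring higher mapq."""
    best = {}
    for pair in pairs:
        # External coordinates identify a potential PCR duplicate group.
        key = (pair["chrom"], pair["start"], pair["end"])
        kept = best.get(key)
        if kept is None or pair["mapq"] > kept["mapq"]:
            best[key] = pair
    return list(best.values())


# Made-up read pairs: p1 and p2 share external coordinates.
pairs = [
    {"name": "p1", "chrom": "chr1", "start": 100, "end": 400, "mapq": 20},
    {"name": "p2", "chrom": "chr1", "start": 100, "end": 400, "mapq": 60},
    {"name": "p3", "chrom": "chr1", "start": 150, "end": 400, "mapq": 30},
]
kept = remove_duplicates(pairs)
```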
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
486
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
487 .B OPTIONS:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
488 .RS
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
489 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
490 .B -s
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
491 Remove duplicates for single-end reads. By default, the command works for
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
492 paired-end reads only.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
493 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
494 .B -S
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
495 Treat paired-end reads as single-end reads.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
496 .RE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
497
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
498 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
499 .B calmd
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
500 samtools calmd [-EeubSr] [-C capQcoef] <aln.bam> <ref.fasta>
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
501
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
502 Generate the MD tag. If the MD tag is already present, this command will
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
503 give a warning if the MD tag generated is different from the existing
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
504 tag. Output SAM by default.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
505
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
506 .B OPTIONS:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
507 .RS
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
508 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
509 .B -A
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
510 When used jointly with
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
511 .B -r
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
512 this option overwrites the original base quality.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
513 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
514 .B -e
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
515 Convert the read base to = if it is identical to the aligned reference
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
516 base. Indel caller does not support the = bases at the moment.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
517 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
518 .B -u
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
519 Output uncompressed BAM
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
520 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
521 .B -b
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
522 Output compressed BAM
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
523 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
524 .B -S
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
525 The input is SAM with header lines
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
526 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
527 .BI -C \ INT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
528 Coefficient to cap mapping quality of poorly mapped reads. See the
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
529 .B mpileup
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
530 command for details. [0]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
531 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
532 .B -r
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
533 Compute the BQ tag (without -A) or cap base quality by BAQ (with -A).
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
534 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
535 .B -E
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
536 Extended BAQ calculation. This option trades specificity for sensitivity, though the
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
537 effect is minor.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
538 .RE
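
To make the MD tag and the effect of option -e concrete, the Python sketch below handles the simplest case only: an ungapped alignment in which the read and the aligned reference segment have the same length. The MD layout used here (match run lengths separated by the reference base at each mismatch) follows the SAM specification as commonly described; md_tag and mask_matches are hypothetical names, and insertions, deletions and clipping are deliberately not handled.

```python
def md_tag(read, ref):
    """MD string for an ungapped alignment of equal-length read and ref."""
    if len(read) != len(ref):
        raise ValueError("read and reference segment must have equal length")
    parts = []
    run = 0  # number of consecutive matching bases seen so far
    for r, g in zip(read.upper(), ref.upper()):
        if r == g:
            run += 1
        else:
            # Emit the match run, then the reference base at the mismatch.
            parts.append(str(run))
            parts.append(g)
            run = 0
    parts.append(str(run))
    return "".join(parts)


def mask_matches(read, ref):
    """Replace read bases identical to the reference with '=' (as -e does)."""
    return "".join(
        "=" if r.upper() == g.upper() else r for r, g in zip(read, ref)
    )
```

With a read ACGTA aligned to reference ACCTA, md_tag gives 2C2 and mask_matches gives ==G==.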
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
539
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
540 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
541 .B targetcut
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
542 samtools targetcut [-Q minBaseQ] [-i inPenalty] [-0 em0] [-1 em1] [-2 em2] [-f ref] <in.bam>
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
543
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
544 This command identifies target regions by examining the continuity of read depth, computes
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
545 haploid consensus sequences of targets and outputs a SAM with each sequence corresponding
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
546 to a target. When option
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
547 .B -f
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
548 is in use, BAQ will be applied. This command is
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
549 .B only
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
550 designed for cutting fosmid clones from fosmid pool sequencing [Ref. Kitzman et al. (2010)].
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
551 .RE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
552
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
553 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
554 .B phase
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
555 samtools phase [-AF] [-k len] [-b prefix] [-q minLOD] [-Q minBaseQ] <in.bam>
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
556
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
557 Call and phase heterozygous SNPs.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
558 .B OPTIONS:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
559 .RS
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
560 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
561 .B -A
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
562 Drop reads with ambiguous phase.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
563 .TP 8
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
564 .BI -b \ STR
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
565 Prefix of BAM output. When this option is in use, phase-0 reads will be saved in file
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
566 .BR STR .0.bam
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
567 and phase-1 reads in
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
568 .BR STR .1.bam.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
569 Phase unknown reads will be randomly allocated to one of the two files. Chimeric reads
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
570 with switch errors will be saved in
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
571 .BR STR .chimeric.bam.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
572 [null]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
573 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
574 .B -F
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
575 Do not attempt to fix chimeric reads.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
576 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
577 .BI -k \ INT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
578 Maximum length for local phasing. [13]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
579 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
580 .BI -q \ INT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
581 Minimum Phred-scaled LOD to call a heterozygote. [40]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
582 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
583 .BI -Q \ INT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
584 Minimum base quality to be used in het calling. [13]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
585 .RE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
586
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
587 .SH BCFTOOLS COMMANDS AND OPTIONS
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
588
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
589 .TP 10
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
590 .B view
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
591 .B bcftools view
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
592 .RB [ \-AbFGNQSucgv ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
593 .RB [ \-D
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
594 .IR seqDict ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
595 .RB [ \-l
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
596 .IR listLoci ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
597 .RB [ \-s
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
598 .IR listSample ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
599 .RB [ \-i
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
600 .IR gapSNPratio ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
601 .RB [ \-t
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
602 .IR mutRate ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
603 .RB [ \-p
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
604 .IR varThres ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
605 .RB [ \-m
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
606 .IR varThres ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
607 .RB [ \-P
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
608 .IR prior ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
609 .RB [ \-1
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
610 .IR nGroup1 ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
611 .RB [ \-d
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
612 .IR minFrac ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
613 .RB [ \-U
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
614 .IR nPerm ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
615 .RB [ \-X
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
616 .IR permThres ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
617 .RB [ \-T
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
618 .IR trioType ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
619 .I in.bcf
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
620 .RI [ region ]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
621
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
622 Convert between BCF and VCF, call variant candidates and estimate allele
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
623 frequencies.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
624
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
625 .RS
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
626 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
627 .B Input/Output Options:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
628 .TP 10
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
629 .B -A
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
630 Retain all possible alternate alleles at variant sites. By default, the view
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
631 command discards unlikely alleles.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
632 .TP 10
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
633 .B -b
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
634 Output in the BCF format. The default is VCF.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
635 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
636 .BI -D \ FILE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
637 Sequence dictionary (list of chromosome names) for VCF->BCF conversion [null]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
638 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
639 .B -F
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
640 Indicate PL is generated by r921 or before (ordering is different).
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
641 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
642 .B -G
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
643 Suppress all individual genotype information.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
644 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
645 .BI -l \ FILE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
646 List of sites at which information is output [all sites]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
647 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
648 .B -N
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
649 Skip sites where the REF field is not A/C/G/T
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
650 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
651 .B -Q
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
652 Output the QCALL likelihood format
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
653 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
654 .BI -s \ FILE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
655 List of samples to use. The first column in the input gives the sample names
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
656 and the second gives the ploidy, which can only be 1 or 2. When the 2nd column
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
657 is absent, the sample ploidy is assumed to be 2. In the output, the ordering of
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
658 samples will be identical to the one in
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
659 .IR FILE .
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
660 [null]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
661 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
662 .B -S
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
663 The input is VCF instead of BCF.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
664 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
665 .B -u
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
666 Uncompressed BCF output (force -b).
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
667 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
668 .B Consensus/Variant Calling Options:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
669 .TP 10
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
670 .B -c
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
671 Call variants using Bayesian inference. This option automatically invokes option
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
672 .BR -e .
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
673 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
674 .BI -d \ FLOAT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
675 When
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
676 .B -v
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
677 is in use, skip loci where the fraction of samples covered by reads is below FLOAT. [0]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
678 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
679 .B -e
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
680 Perform max-likelihood inference only, including estimating the site allele frequency,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
681 testing Hardy-Weinberg equilibrium and testing associations with LRT.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
682 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
683 .B -g
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
684 Call per-sample genotypes at variant sites (force -c)
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
685 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
686 .BI -i \ FLOAT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
687 Ratio of INDEL-to-SNP mutation rate [0.15]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
688 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
689 .BI -m \ FLOAT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
690 New model for improved multiallelic and rare-variant calling. Another
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
691 ALT allele is accepted if P(chi^2) of LRT exceeds the FLOAT threshold. The
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
692 parameter seems robust and the actual value usually does not affect the results
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
693 much; a good value to use is 0.99. This is the recommended calling method. [0]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
694 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
695 .BI -p \ FLOAT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
696 A site is considered to be a variant if P(ref|D)<FLOAT [0.5]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
697 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
698 .BI -P \ STR
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
699 Prior or initial allele frequency spectrum. STR can be
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
700 .IR full ,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
701 .IR cond2 ,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
702 .I flat
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
703 or the file consisting of error output from a previous variant calling
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
704 run.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
705 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
706 .BI -t \ FLOAT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
707 Scaled mutation rate for variant calling [0.001]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
708 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
709 .BI -T \ STR
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
710 Enable pair/trio calling. For trio calling, option
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
711 .B -s
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
712 usually needs to be supplied to configure the trio members and their ordering.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
713 In the file supplied to the option
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
714 .BR -s ,
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
715 the first sample must be the child, the second the father and the third the mother.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
716 The valid values of
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
717 .I STR
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
718 are `pair', `trioauto', `trioxd' and `trioxs', where `pair' calls differences between two input samples, and `trioxd' (`trioxs') specifies that the input
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
719 is from the X chromosome non-PAR regions and the child is a female (male). [null]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
720 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
721 .B -v
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
722 Output variant sites only (force -c)
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
723 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
724 .B Contrast Calling and Association Test Options:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
725 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
726 .BI -1 \ INT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
727 Number of group-1 samples. This option is used for dividing the samples into
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
728 two groups for contrast SNP calling or association test.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
729 When this option is in use, the following VCF INFO fields will be output:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
730 PC2, PCHI2 and QCHI2. [0]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
731 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
732 .BI -U \ INT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
733 Number of permutations for association test (effective only with
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
734 .BR -1 )
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
735 [0]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
736 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
737 .BI -X \ FLOAT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
738 Only perform permutations for P(chi^2)<FLOAT (effective only with
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
739 .BR -U )
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
740 [0.01]
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
741 .RE
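
The sample list accepted by option -s (sample name in the first column, optional ploidy of 1 or 2 in the second, defaulting to 2) can be read as in the Python sketch below. The whitespace-separated column layout and the name parse_sample_list are assumptions made for this illustration.

```python
def parse_sample_list(text):
    """Return [(sample_name, ploidy), ...] in file order."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # A missing second column means the sample is treated as diploid.
        ploidy = int(fields[1]) if len(fields) > 1 else 2
        if ploidy not in (1, 2):
            raise ValueError("ploidy must be 1 or 2: %r" % line)
        samples.append((fields[0], ploidy))
    return samples


# Made-up trio list in the order child, father, mother.
trio = parse_sample_list("child\nfather 1\nmother\t2\n")
```

Keeping the result as an ordered list matters because the output sample order follows the file, and trio calling expects child, father, mother in that order.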
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
742
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
743 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
744 .B index
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
745 .B bcftools index
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
746 .I in.bcf
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
747
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
748 Index sorted BCF for random access.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
749 .RE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
750
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
751 .TP
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
752 .B cat
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
753 .B bcftools cat
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
754 .I in1.bcf
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
755 .RI [ "in2.bcf " [ ... "]]]"
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
756
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
757 Concatenate BCF files. The input files are required to be sorted and
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
758 have identical samples appearing in the same order.
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
759 .RE
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
760 .SH SAM FORMAT
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
761
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
762 Sequence Alignment/Map (SAM) format is TAB-delimited. Apart from the header lines, which start
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
763 with the `@' symbol, each alignment line consists of:
903fc43d6227 Uploaded
lsong10
parents:
diff changeset
764
.TS
center box;
cb | cb | cb
n | l | l .
Col	Field	Description
_
1	QNAME	Query template/pair NAME
2	FLAG	bitwise FLAG
3	RNAME	Reference sequence NAME
4	POS	1-based leftmost POSition/coordinate of clipped sequence
5	MAPQ	MAPping Quality (Phred-scaled)
6	CIGAR	extended CIGAR string
7	MRNM	Mate Reference sequence NaMe (`=' if same as RNAME)
8	MPOS	1-based Mate POSition
9	TLEN	inferred Template LENgth (insert size)
10	SEQ	query SEQuence on the same strand as the reference
11	QUAL	query QUALity (ASCII-33 gives the Phred base quality)
12+	OPT	variable OPTional fields in the format TAG:VTYPE:VALUE
.TE
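
As an illustration of the column layout above, the following Python sketch splits one alignment line into the eleven mandatory fields plus the optional ones and converts QUAL to Phred scores. It is not part of samtools: the function names and the example record are made up for this illustration, and the caller is assumed to have skipped `@' header lines and removed the trailing newline.

```python
TAB = chr(9)  # SAM columns are separated by single TAB characters

# Column names in the order given by the table above.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "MRNM", "MPOS", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "MPOS", "TLEN"}


def parse_sam_line(line):
    """Split one alignment line (without its newline) into named fields."""
    cols = line.split(TAB)
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("expected at least 11 columns, got %d" % len(cols))
    record = {}
    for name, value in zip(SAM_FIELDS, cols):
        record[name] = int(value) if name in INT_FIELDS else value
    # Columns 12+ are TAG:VTYPE:VALUE triples; VALUE may itself contain ':'.
    record["OPT"] = [tuple(opt.split(":", 2)) for opt in cols[len(SAM_FIELDS):]]
    return record


def phred_qualities(qual):
    """Convert a QUAL string to Phred scores (ASCII code minus 33)."""
    if qual == "*":  # '*' means that no qualities are stored
        return []
    return [ord(ch) - 33 for ch in qual]


# Hypothetical record, built field by field for readability.
example = TAB.join(["read1", "99", "chr2", "20100000", "60", "4M", "=",
                    "20100200", "204", "ACGT", "II5!", "NM:i:0", "RG:Z:ga"])
rec = parse_sam_line(example)
print(rec["RNAME"], rec["POS"], rec["CIGAR"])
print(phred_qualities(rec["QUAL"]))
print(rec["OPT"])
```

The optional fields are split at most twice so that a string value containing a colon stays in one piece.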

.PP
Each bit in the FLAG field is defined as:

.TS
center box;
cb | cb | cb
l | c | l .
Flag	Chr	Description
_
0x0001	p	the read is paired in sequencing
0x0002	P	the read is mapped in a proper pair
0x0004	u	the query sequence itself is unmapped
0x0008	U	the mate is unmapped
0x0010	r	strand of the query (1 for reverse)
0x0020	R	strand of the mate
0x0040	1	the read is the first read in a pair
0x0080	2	the read is the second read in a pair
0x0100	s	the alignment is not primary
0x0200	f	the read fails platform/vendor quality checks
0x0400	d	the read is either a PCR or an optical duplicate
.TE

where the second column gives the string representation of the FLAG field.
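
The mapping between the integer FLAG and its string representation can be sketched in Python as below. The bit values and characters are copied from the table; the function names are hypothetical, and emitting the characters in ascending bit order is an assumption made for this sketch rather than a documented guarantee.

```python
# (bit, character) pairs copied from the FLAG table.
FLAG_BITS = [
    (0x0001, "p"), (0x0002, "P"), (0x0004, "u"), (0x0008, "U"),
    (0x0010, "r"), (0x0020, "R"), (0x0040, "1"), (0x0080, "2"),
    (0x0100, "s"), (0x0200, "f"), (0x0400, "d"),
]


def flag_to_string(flag):
    """Return the characters of all bits that are set in an integer FLAG."""
    return "".join(ch for bit, ch in FLAG_BITS if flag & bit)


def string_to_flag(chars):
    """Inverse of flag_to_string: OR together the bit named by each character."""
    lookup = {ch: bit for bit, ch in FLAG_BITS}
    flag = 0
    for ch in chars:
        flag |= lookup[ch]
    return flag


print(flag_to_string(99))    # 99 = 0x01 + 0x02 + 0x20 + 0x40, prints pPR1
print(flag_to_string(0x10))  # prints r
print(string_to_flag("pPR1"))  # prints 99
```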

.SH VCF FORMAT

The Variant Call Format (VCF) is a TAB-delimited format in which each data line consists of the following fields:
.TS
center box;
cb | cb | cb
n | l | l .
Col	Field	Description
_
1	CHROM	CHROMosome name
2	POS	the left-most POSition of the variant
3	ID	unique variant IDentifier
4	REF	the REFerence allele
5	ALT	the ALTernate allele(s), separated by comma
6	QUAL	variant/reference QUALity
7	FILTER	FILTers applied
8	INFO	INFOrmation related to the variant, separated by semi-colon
9	FORMAT	FORMAT of the genotype fields, separated by colon (optional)
10+	SAMPLE	SAMPLE genotypes and per-sample information (optional)
.TE

.PP
The following table gives the
.B INFO
tags used by samtools and bcftools.

.TS
center box;
cb | cb | cb
l | l | l .
Tag	Format	Description
_
AF1	double	Max-likelihood estimate of the site allele frequency (AF) of the first ALT allele
DP	int	Raw read depth (without quality filtering)
DP4	int[4]	# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases
FQ	int	Consensus quality. Positive: sample genotypes different; negative: otherwise
MQ	int	Root-Mean-Square mapping quality of covering reads
PC2	int[2]	Phred probability of AF in group1 samples being larger (,smaller) than in group2
PCHI2	double	Posterior weighted chi^2 P-value between group1 and group2 samples
PV4	double[4]	P-value for strand bias, baseQ bias, mapQ bias and tail distance bias
QCHI2	int	Phred-scaled PCHI2
RP	int	# permutations yielding a smaller PCHI2
CLR	int	Phred log ratio of genotype likelihoods with and without the trio/pair constraint
UGT	string	Most probable genotype configuration without the trio constraint
CGT	string	Most probable configuration with the trio constraint
VDB	float	Tests variant positions within reads. Intended for filtering RNA-seq artifacts around splice sites
RPB	float	Mann-Whitney rank-sum test for tail distance bias
HWE	float	Hardy-Weinberg equilibrium test, Wigginton et al., PMID: 15789306
.TE
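
To make the two tables above concrete, here is a small Python sketch that splits a VCF data line into its columns and unpacks the INFO field, including the four DP4 counts. It is an illustration only: the function names and the example line are invented, and it assumes the usual VCF conventions that a key without `=' is a flag, that `.' stands for an empty INFO field, and that array values such as DP4 are comma-separated.

```python
TAB = chr(9)  # VCF columns are separated by single TAB characters

VCF_FIELDS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_info(info):
    """Turn 'K1=v1;K2=v2;FLAG' into a dict; keys without '=' map to True."""
    result = {}
    if info == ".":  # '.' marks an empty INFO field
        return result
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


def parse_vcf_line(line):
    """Split one data line (without its newline) into named columns."""
    cols = line.split(TAB)
    record = dict(zip(VCF_FIELDS, cols[:8]))
    record["POS"] = int(record["POS"])
    record["ALT"] = record["ALT"].split(",")
    record["INFO"] = parse_info(record["INFO"])
    # FORMAT and the per-sample columns are optional.
    record["FORMAT"] = cols[8].split(":") if len(cols) > 8 else []
    record["SAMPLES"] = cols[9:]
    return record


def dp4_counts(info):
    """Return DP4 as (ref_fwd, ref_rev, alt_fwd, alt_rev), or None if absent."""
    if "DP4" not in info:
        return None
    return tuple(int(x) for x in info["DP4"].split(","))


# Hypothetical single-sample record.
example = TAB.join(["chr2", "20100000", ".", "A", "G", "57", ".",
                    "DP=30;AF1=0.5;DP4=8,7,9,6;MQ=48", "GT:PL", "0/1:87,0,90"])
rec = parse_vcf_line(example)
print(rec["CHROM"], rec["POS"], rec["ALT"])
print(dp4_counts(rec["INFO"]))
```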

.SH EXAMPLES
.IP o 2
Import SAM to BAM when
.B @SQ
lines are present in the header:

  samtools view -bS aln.sam > aln.bam

If
.B @SQ
lines are absent:

  samtools faidx ref.fa
  samtools view -bt ref.fa.fai aln.sam > aln.bam

where
.I ref.fa.fai
is generated automatically by the
.B faidx
command.

.IP o 2
Attach the
.B RG
tag while merging sorted alignments:

  perl -e 'print "\\@RG\\tID:ga\\tSM:hs\\tLB:ga\\tPL:Illumina\\n\\@RG\\tID:454\\tSM:hs\\tLB:454\\tPL:454\\n"' > rg.txt
  samtools merge -rh rg.txt merged.bam ga.bam 454.bam

The value in a
.B RG
tag is determined by the name of the file the read comes from. In this
example, in the
.IR merged.bam ,
reads from
.I ga.bam
will be tagged with
.IR RG:Z:ga ,
while reads from
.I 454.bam
will be tagged with
.IR RG:Z:454 .

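The header file for this example can also be produced without a perl one-liner. The Python sketch below prints the same two @RG lines; the function name is hypothetical, and the script is assumed to be run with its output redirected to rg.txt.

```python
TAB = chr(9)  # header fields are TAB-delimited


def rg_header_lines(read_groups):
    """Return one @RG header line per (ID, SM, LB, PL) tuple."""
    lines = []
    for rg_id, sample, library, platform in read_groups:
        fields = ["@RG", "ID:" + rg_id, "SM:" + sample,
                  "LB:" + library, "PL:" + platform]
        lines.append(TAB.join(fields))
    return lines


# The same two read groups as in the merge example.
for line in rg_header_lines([("ga", "hs", "ga", "Illumina"),
                             ("454", "hs", "454", "454")]):
    print(line)
```

Each ID matches the base name of one input file, which is what the merge example relies on when it assigns the tag from the file name.
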
.IP o 2
Call SNPs and short INDELs for one diploid individual:

  samtools mpileup -ugf ref.fa aln.bam | bcftools view -bvcg - > var.raw.bcf
  bcftools view var.raw.bcf | vcfutils.pl varFilter -D 100 > var.flt.vcf

The
.B -D
option of varFilter controls the maximum read depth, which should be
adjusted to about twice the average read depth. Consider adding
.B -C50
to
.B mpileup
if mapping quality is overestimated for reads containing excessive
mismatches. Applying this option usually helps
.B BWA-short
but may not help other mappers.

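One way to pick a value for the maximum read depth is to average per-position depths and double the result. The Python sketch below assumes a three-column text input (sequence name, position, depth), for instance what a per-position depth command such as samtools depth prints in versions that provide it. The function name and the example numbers are hypothetical, and positions missing from the input are not counted, so the average is over listed positions only.

```python
def suggest_max_depth(depth_lines, factor=2.0):
    """Return factor times the mean of column 3, rounded to an integer."""
    total = 0
    sites = 0
    for line in depth_lines:
        cols = line.split()
        if len(cols) < 3:  # skip blank or malformed lines
            continue
        total += int(cols[2])
        sites += 1
    if sites == 0:
        raise ValueError("no depth records found")
    return int(round(factor * total / sites))


# Three made-up positions with a mean depth of 50.
example = ["chr2 20100000 48", "chr2 20100001 50", "chr2 20100002 52"]
print(suggest_max_depth(example))  # prints 100
```
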
.IP o 2
Generate the consensus sequence for one diploid individual:

  samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq

.IP o 2
Call somatic mutations from a pair of samples:

  samtools mpileup -DSuf ref.fa aln.bam | bcftools view -bvcgT pair - > var.bcf

In the output INFO field,
.I CLR
gives the Phred-scaled log ratio between the likelihood obtained by treating the
two samples independently and the likelihood obtained by requiring the genotypes to be identical.
This
.I CLR
is effectively a score measuring the confidence of somatic calls. Higher is better.

.IP o 2
Call de novo and somatic mutations from a family trio:

  samtools mpileup -DSuf ref.fa aln.bam | bcftools view -bvcgT pair -s samples.txt - > var.bcf

File
.I samples.txt
should consist of three lines specifying the members of the trio and their order (child, father, mother).
Similarly,
.I CLR
gives the Phred-scaled log likelihood ratio with and without the trio constraint.
.I UGT
shows the most likely genotype configuration without the trio constraint, and
.I CGT
gives the most likely genotype configuration satisfying the trio constraint.

.IP o 2
Phase one individual:

  samtools calmd -AEur aln.bam ref.fa | samtools phase -b prefix - > phase.out

The
.B calmd
command is used to reduce false heterozygotes around INDELs.

.IP o 2
Call SNPs and short indels for multiple diploid individuals:

  samtools mpileup -P ILLUMINA -ugf ref.fa *.bam | bcftools view -bcvg - > var.raw.bcf
  bcftools view var.raw.bcf | vcfutils.pl varFilter -D 2000 > var.flt.vcf

Individuals are identified from the
.B SM
tags in the
.B @RG
header lines. Individuals can be pooled in one alignment file; one
individual can also be split across multiple files. The
.B -P
option specifies that indel candidates should be collected only from
read groups with the
.B @RG-PL
tag set to
.IR ILLUMINA .
Collecting indel candidates from reads sequenced by an indel-prone
technology may affect the performance of indel calling.

Note that there is a new calling model which can be invoked by

  bcftools view -m0.99 ...

which fixes some severe limitations of the default method.

For filtering, best results seem to be achieved by first applying the
.I SnpGap
filter and then applying a machine-learning approach:

  vcf-annotate -f SnpGap=n
  vcf filter ...

Both can be found in the
.B vcftools
and
.B htslib
packages (links below).

.IP o 2
Derive the allele frequency spectrum (AFS) on a list of sites from multiple individuals:

  samtools mpileup -Igf ref.fa *.bam > all.bcf
  bcftools view -bl sites.list all.bcf > sites.bcf
  bcftools view -cGP cond2 sites.bcf > /dev/null 2> sites.1.afs
  bcftools view -cGP sites.1.afs sites.bcf > /dev/null 2> sites.2.afs
  bcftools view -cGP sites.2.afs sites.bcf > /dev/null 2> sites.3.afs
  ......

where
.I sites.list
contains the list of sites with each line consisting of the reference
sequence name and position. The subsequent
.B bcftools
commands estimate the AFS by EM, each run starting from the spectrum written by the previous one.

.IP o 2
Dump BAQ-applied alignments for other SNP callers:

  samtools calmd -bAr aln.bam ref.fa > aln.baq.bam

It adds and corrects the
.B NM
and
.B MD
tags at the same time. The
.B calmd
command also comes with the
.B -C
option, the same as the one in
.B pileup
and
.BR mpileup .
Apply if it helps.

.SH LIMITATIONS
.PP
.IP o 2
Unaligned words used in bam_import.c, bam_endian.h, bam.c and bam_aux.c.
.IP o 2
Samtools paired-end rmdup does not work for unpaired reads (e.g. orphan
reads or ends mapped to different chromosomes). If this is a concern,
please use Picard's MarkDuplicates, which correctly handles these cases,
although it is a little slower.

.SH AUTHOR
.PP
Heng Li from the Sanger Institute wrote the C version of samtools. Bob
Handsaker from the Broad Institute implemented the BGZF library and Jue
Ruan from Beijing Genomics Institute wrote the RAZF library. John
Marshall and Petr Danecek contribute to the source code and various
people from the 1000 Genomes Project have contributed to the SAM format
specification.

.SH SEE ALSO
.PP
Samtools website: <http://samtools.sourceforge.net>
.br
Samtools latest source: <https://github.com/samtools/samtools>
.br
VCFtools website with stable link to VCF specification: <http://vcftools.sourceforge.net>
.br
HTSlib website: <https://github.com/samtools/htslib>