annotate SNV/SNVMix2_source/SNVMix2-v0.12.1-rc1/samtools-0.1.6/samtools.1 @ 1:6d4997bc1c18

Uploaded
author ryanmorin
date Wed, 12 Oct 2011 19:53:45 -0400
parents 74f5ea818cea
children
0
74f5ea818cea Uploaded
ryanmorin
parents:
diff changeset
.TH samtools 1 "2 September 2009" "samtools-0.1.6" "Bioinformatics tools"
.SH NAME
.PP
samtools - Utilities for the Sequence Alignment/Map (SAM) format
.SH SYNOPSIS
.PP
samtools view -bt ref_list.txt -o aln.bam aln.sam.gz
.PP
samtools sort aln.bam aln.sorted
.PP
samtools index aln.sorted.bam
.PP
samtools view aln.sorted.bam chr2:20,100,000-20,200,000
.PP
samtools merge out.bam in1.bam in2.bam in3.bam
.PP
samtools faidx ref.fasta
.PP
samtools pileup -f ref.fasta aln.sorted.bam
.PP
samtools tview aln.sorted.bam ref.fasta

.SH DESCRIPTION
.PP
Samtools is a set of utilities that manipulate alignments in the BAM
format. It imports from and exports to the SAM (Sequence Alignment/Map)
format, sorts, merges and indexes alignments, and allows reads in any
region to be retrieved swiftly.

Samtools is designed to work on a stream. It regards an input file `-'
as the standard input (stdin) and an output file `-' as the standard
output (stdout). Several commands can thus be combined with Unix
pipes. Samtools always outputs warning and error messages to the standard
error output (stderr).

Samtools is also able to open a BAM (not SAM) file on a remote FTP or
HTTP server if the BAM file name starts with `ftp://' or `http://'.
Samtools checks the current working directory for the index file and
downloads the index if it is absent. Samtools does not retrieve the
entire alignment file unless it is asked to do so.

.SH COMMANDS AND OPTIONS

.TP 10
.B import
samtools import <in.ref_list> <in.sam> <out.bam>

Since 0.1.4, this command is an alias of:

samtools view -bt <in.ref_list> -o <out.bam> <in.sam>

.TP
.B sort
samtools sort [-n] [-m maxMem] <in.bam> <out.prefix>

Sort alignments by leftmost coordinates. File
.I <out.prefix>.bam
will be created. This command may also create temporary files
.I <out.prefix>.%d.bam
when the whole alignment cannot be fitted into memory (controlled by
option -m).

.B OPTIONS:
.RS
.TP 8
.B -n
Sort by read names rather than by chromosomal coordinates
.TP
.B -m INT
Approximately the maximum required memory. [500000000]
.RE

.TP
.B merge
samtools merge [-h inh.sam] [-n] <out.bam> <in1.bam> <in2.bam> [...]

Merge multiple sorted alignments.
The header reference lists of all the input BAM files, and the @SQ headers of
.IR inh.sam ,
if any, must all refer to the same set of reference sequences.
The header reference list and (unless overridden by
.BR -h )
`@' headers of
.I in1.bam
will be copied to
.IR out.bam ,
and the headers of other files will be ignored.

.B OPTIONS:
.RS
.TP 8
.B -h FILE
Use the lines of
.I FILE
as `@' headers to be copied to
.IR out.bam ,
replacing any header lines that would otherwise be copied from
.IR in1.bam .
.RI ( FILE
is actually in SAM format, though any alignment records it may contain
are ignored.)
.TP
.B -n
The input alignments are sorted by read names rather than by chromosomal
coordinates
.RE

.TP
.B index
samtools index <aln.bam>

Index sorted alignment for fast random access. Index file
.I <aln.bam>.bai
will be created.

.TP
.B view
samtools view [-bhuHS] [-t in.refList] [-o output] [-f reqFlag] [-F
skipFlag] [-q minMapQ] [-l library] [-r readGroup] <in.bam>|<in.sam> [region1 [...]]

Extract/print all alignments or a subset of them in SAM or BAM format. If
no region is specified, all the alignments will be printed; otherwise only
alignments overlapping the specified regions will be output. An
alignment may be output multiple times if it overlaps several
regions. A region can be given, for example, in one of the following
formats: `chr2', `chr2:1000000' or `chr2:1,000,000-2,000,000'.
Coordinates are 1-based.
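As a rough illustration of the region syntax described above, the following
Python sketch splits a region string into its parts. It is not code from
samtools: the function name parse_region is hypothetical, the sketch assumes
that the reference name itself contains no `:', and it simply returns None
for a bound that is not given rather than guessing how samtools fills it in.

```python
def parse_region(region):
    """Split a region string into (name, start, end).

    Coordinates are 1-based; a bound that is absent is returned as None.
    Assumes the reference name contains no ':' (an assumption of this sketch).
    """
    name, sep, span = region.partition(":")
    if not sep:
        # Only a reference name was given, e.g. "chr2".
        return name, None, None
    # Thousands separators are allowed, e.g. "1,000,000".
    span = span.replace(",", "")
    start_text, dash, end_text = span.partition("-")
    start = int(start_text)
    end = int(end_text) if dash else None
    return name, start, end
```

For example, parse_region("chr2:1,000,000-2,000,000") yields the reference
name `chr2' with start 1000000 and end 2000000.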

.B OPTIONS:
.RS
.TP 8
.B -b
Output in the BAM format.
.TP
.B -u
Output uncompressed BAM. This option saves time spent on
compression/decompression and is thus preferred when the output is piped
to another samtools command.
.TP
.B -h
Include the header in the output.
.TP
.B -H
Output the header only.
.TP
.B -S
Input is in SAM. If @SQ header lines are absent, the
.B `-t'
option is required.
.TP
.B -t FILE
This file is TAB-delimited. Each line must contain the reference name
and the length of the reference, one line for each distinct reference;
additional fields are ignored. This file also defines the order of the
reference sequences in sorting. If you run `samtools faidx <ref.fa>',
the resultant index file
.I <ref.fa>.fai
can be used as this
.I <in.ref_list>
file.
.TP
.B -o FILE
Output file [stdout]
.TP
.B -f INT
Only output alignments with all bits in INT present in the FLAG
field. INT can be in hex in the format of /^0x[0-9A-F]+/ [0]
.TP
.B -F INT
Skip alignments with bits present in INT [0]
.TP
.B -q INT
Skip alignments with MAPQ smaller than INT [0]
.TP
.B -l STR
Only output reads in library STR [null]
.TP
.B -r STR
Only output reads in read group STR [null]
.RE
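The effect of the -f, -F and -q options above can be pictured as a simple
bitwise test on each alignment. The Python sketch below is only an
illustration and is not taken from the samtools source; the names
parse_flag and keep_alignment are hypothetical. It assumes that -F skips an
alignment when any of the given bits is set, since the wording above does
not say whether any or all bits must be present.

```python
def parse_flag(text):
    """Read an INT argument given in decimal or in hex such as '0x10'."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    return int(text)


def keep_alignment(flag, mapq, req_flag=0, skip_flag=0, min_mapq=0):
    """Return True if an alignment passes the -f / -F / -q style filters."""
    # -f: every bit of req_flag must be present in FLAG.
    if (flag & req_flag) != req_flag:
        return False
    # -F: assumed to reject the alignment if any bit of skip_flag is present.
    if flag & skip_flag:
        return False
    # -q: skip alignments whose MAPQ is smaller than min_mapq.
    return mapq >= min_mapq
```

With the defaults of 0 for all three filters every alignment is kept, which
matches the [0] defaults listed above.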

.TP
.B faidx
samtools faidx <ref.fasta> [region1 [...]]

Index reference sequence in the FASTA format or extract subsequence from
indexed reference sequence. If no region is specified,
.B faidx
will index the file and create
.I <ref.fasta>.fai
on the disk. If regions are specified, the subsequences will be
retrieved and printed to stdout in the FASTA format. The input file can
be compressed in the
.B RAZF
format.

.TP
.B pileup
samtools pileup [-f in.ref.fasta] [-t in.ref_list] [-l in.site_list]
[-iscgS2] [-T theta] [-N nHap] [-r pairDiffRate] <in.bam>|<in.sam>

Print the alignment in the pileup format. In the pileup format, each
line represents a genomic position, consisting of chromosome name,
coordinate, reference base, read bases, read qualities and alignment
mapping qualities. Information on match, mismatch, indel, strand,
mapping quality and start and end of a read are all encoded at the read
base column. At this column, a dot stands for a match to the reference
base on the forward strand, a comma for a match on the reverse strand,
`ACGTN' for a mismatch on the forward strand and `acgtn' for a mismatch
on the reverse strand. A pattern `\\+[0-9]+[ACGTNacgtn]+' indicates
there is an insertion between this reference position and the next
reference position. The length of the insertion is given by the integer
in the pattern, followed by the inserted sequence. Similarly, a pattern
`-[0-9]+[ACGTNacgtn]+' represents a deletion from the reference. The
deleted bases will be presented as `*' in the following lines. Also at
the read base column, a symbol `^' marks the start of a read segment
which is a contiguous subsequence on the read separated by `N/S/H' CIGAR
operations. The ASCII of the character following `^' minus 33 gives the
mapping quality. A symbol `$' marks the end of a read segment.
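The read base column described above can be decoded mechanically. The
following Python sketch is an illustration only, not code from samtools:
the function name decode_read_bases and the (kind, value) tuples it returns
are hypothetical, and it assumes the column contains only the symbols
listed above.

```python
def decode_read_bases(column):
    """Turn a pileup read base column into a list of (kind, value) events."""
    events = []
    i = 0
    while i < len(column):
        ch = column[i]
        if ch == "^":
            # Start of a read segment; the next character encodes the
            # mapping quality as ASCII minus 33.
            events.append(("start", ord(column[i + 1]) - 33))
            i += 2
        elif ch == "$":
            events.append(("end", None))
            i += 1
        elif ch in "+-":
            # Indel: sign, length in digits, then that many bases.
            j = i + 1
            while j < len(column) and column[j].isdigit():
                j += 1
            length = int(column[i + 1:j])
            kind = "insertion" if ch == "+" else "deletion"
            events.append((kind, column[j:j + length]))
            i = j + length
        elif ch == ".":
            events.append(("match", "forward"))
            i += 1
        elif ch == ",":
            events.append(("match", "reverse"))
            i += 1
        elif ch == "*":
            # A base deleted from the reference by an earlier deletion.
            events.append(("deleted", None))
            i += 1
        elif ch in "ACGTNacgtn":
            # Upper case: mismatch on the forward strand; lower: reverse.
            events.append(("mismatch", ch))
            i += 1
        else:
            raise ValueError("unexpected symbol %r at offset %d" % (ch, i))
    return events
```

For instance, in the column `^I.' the character `I' has ASCII code 73, so
the read segment starting there has mapping quality 40.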

If option
.B -c
is applied, the consensus base, consensus quality, SNP quality and RMS
mapping quality of the reads covering the site will be inserted between
the `reference base' and the `read bases' columns. An indel occupies an
additional line. Each indel line consists of chromosome name,
coordinate, a star, the genotype, consensus quality, SNP quality, RMS
mapping quality, # covering reads, the first allele, the second allele,
# reads supporting the first allele, # reads supporting the second
allele and # reads containing indels different from the top two alleles.

.B OPTIONS:
.RS

.TP 10
.B -s
Print the mapping quality as the last column. This option makes the
output easier to parse, although this format is not space efficient.

.TP
.B -S
The input file is in SAM.

.TP
.B -i
Only output pileup lines containing indels.

.TP
.B -f FILE
The reference sequence in the FASTA format. Index file
.I FILE.fai
will be created if
absent.

.TP
.B -M INT
Cap mapping quality at INT [60]

.TP
.B -t FILE
List of reference names and sequence lengths, in the format described
for the
.B import
command. If this option is present, samtools assumes the input
.I <in.alignment>
is in SAM format; otherwise it assumes BAM format.

.TP
.B -l FILE
List of sites at which pileup is output. This file is space
delimited. The first two columns are required to be chromosome and
1-based coordinate. Additional columns are ignored. It is
recommended to use option
.B -s
together with
.B -l
because in the default format the mapping quality may not be known.

.TP
.B -c
Call the consensus sequence using the MAQ consensus model. Options
.B -T,
.B -N,
.B -I
and
.B -r
are only effective when
.B -c
or
.B -g
is in use.

.TP
.B -g
Generate genotype likelihood in the binary GLFv3 format. This option
suppresses -c, -i and -s.

.TP
.B -T FLOAT
The theta parameter (error dependency coefficient) in the MAQ consensus
calling model [0.85]

.TP
.B -N INT
Number of haplotypes in the sample (>=2) [2]

.TP
.B -r FLOAT
Expected fraction of differences between a pair of haplotypes [0.001]

.TP
.B -I INT
Phred probability of an indel in sequencing/prep. [40]

.RE
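Several values above (the -I option, and mapping qualities in general) are
Phred-scaled, that is, a quality Q corresponds to an error probability of
10^(-Q/10). The short Python sketch below shows the conversion; the
function name phred_to_probability is hypothetical and the sketch is not
part of samtools.

```python
def phred_to_probability(q):
    """Convert a Phred-scaled quality Q to a probability, P = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)
```

Under this scale the default of 40 for -I corresponds to a probability of
one in ten thousand.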

.TP
.B tview
samtools tview <in.sorted.bam> [ref.fasta]

Text alignment viewer (based on the ncurses library). In the viewer,
press `?' for help and press `g' to view the alignments starting from a
region given in a format like `chr10:10,000,000'.

.RE

.TP
.B fixmate
samtools fixmate <in.nameSrt.bam> <out.bam>

Fill in mate coordinates, ISIZE and mate related flags from a
name-sorted alignment.

.TP
.B rmdup
samtools rmdup <input.srt.bam> <out.bam>

Remove potential PCR duplicates: if multiple read pairs have identical
external coordinates, only retain the pair with the highest mapping quality.
This command
.B ONLY
works with FR orientation and requires that ISIZE be correctly set.
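The duplicate rule stated above can be sketched in a few lines of Python.
This is an illustration under stated assumptions, not the rmdup
implementation: the function name remove_duplicate_pairs and the dictionary
keys (name, chrom, start, end, mapq) are hypothetical, `external
coordinates' is taken to mean the reference name plus the outer start and
end of the pair, and on a tie the pair seen first is kept.

```python
def remove_duplicate_pairs(pairs):
    """Keep one read pair per set of external coordinates.

    Each pair is a dict with the hypothetical keys 'name', 'chrom',
    'start', 'end' and 'mapq'. Among pairs sharing (chrom, start, end),
    the one with the highest mapping quality is retained; on a tie the
    pair encountered first wins.
    """
    best = {}
    for pair in pairs:
        key = (pair["chrom"], pair["start"], pair["end"])
        kept = best.get(key)
        if kept is None or pair["mapq"] > kept["mapq"]:
            best[key] = pair
    return list(best.values())
```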

.RE

.TP
.B rmdupse
samtools rmdupse <input.srt.bam> <out.bam>

Remove potential duplicates for single-ended reads. This command will
treat all reads as single-ended even if they are in fact paired.

.RE

.TP
.B fillmd
samtools fillmd [-e] <aln.bam> <ref.fasta>

Generate the MD tag. If the MD tag is already present, this command will
give a warning if the MD tag generated is different from the existing
tag.

.B OPTIONS:
.RS
.TP 8
.B -e
Convert a read base to = if it is identical to the aligned reference
base. The indel caller does not support the = bases at the moment.

.RE
74f5ea818cea Uploaded
ryanmorin
parents:
diff changeset
371
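.PP
The effect of the -e option can be illustrated with a minimal Python sketch. This is not the samtools implementation: the function name mask_identical_bases is hypothetical, the reference segment is assumed to be already extracted, and the alignment is assumed to be ungapped (no indels or clipping), so read and reference bases pair up one to one. Case-insensitive comparison is also an assumption.

```python
def mask_identical_bases(read_seq: str, ref_seq: str) -> str:
    """Replace each read base with '=' where it matches the reference base.

    Assumes an ungapped alignment: read_seq and ref_seq have the same
    length and position i of one is aligned to position i of the other.
    """
    if len(read_seq) != len(ref_seq):
        raise ValueError("sketch only handles ungapped alignments of equal length")
    return "".join(
        "=" if read_base.upper() == ref_base.upper() else read_base
        for read_base, ref_base in zip(read_seq, ref_seq)
    )
```

Only the mismatching bases remain visible in the result, which is the point of the = notation.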
.SH SAM FORMAT

SAM is TAB-delimited. Apart from the header lines, which start
with the `@' symbol, each alignment line consists of:

.TS
center box;
cb | cb | cb
n | l | l .
Col	Field	Description
_
1	QNAME	Query (pair) NAME
2	FLAG	bitwise FLAG
3	RNAME	Reference sequence NAME
4	POS	1-based leftmost POSition/coordinate of clipped sequence
5	MAPQ	MAPping Quality (Phred-scaled)
6	CIGAR	extended CIGAR string
7	MRNM	Mate Reference sequence NaMe (`=' if same as RNAME)
8	MPOS	1-based Mate POSition
9	ISIZE	Inferred insert SIZE
10	SEQ	query SEQuence on the same strand as the reference
11	QUAL	query QUALity (ASCII-33 gives the Phred base quality)
12	OPT	variable OPTional fields in the format TAG:VTYPE:VALUE
.TE

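.PP
The column layout above can be read with a few lines of Python. The following is a sketch, not part of samtools: the names SamRecord, parse_alignment_line and phred_qualities are hypothetical, FLAG is assumed to be written in decimal, and header lines (those starting with `@') are assumed to have been filtered out already.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    """One alignment line, with fields named after the columns above."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    mrnm: str
    mpos: int
    isize: int
    seq: str
    qual: str
    opt: List[str]  # zero or more TAG:VTYPE:VALUE strings


def parse_alignment_line(line: str) -> SamRecord:
    """Split one TAB-delimited alignment line into its columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 TAB-delimited fields")
    return SamRecord(
        qname=fields[0], flag=int(fields[1]), rname=fields[2],
        pos=int(fields[3]), mapq=int(fields[4]), cigar=fields[5],
        mrnm=fields[6], mpos=int(fields[7]), isize=int(fields[8]),
        seq=fields[9], qual=fields[10], opt=fields[11:],
    )


def phred_qualities(qual: str) -> List[int]:
    """Decode QUAL: the ASCII code of each character minus 33."""
    return [ord(ch) - 33 for ch in qual]
```

Columns 1 to 11 are mandatory, so the sketch rejects shorter lines; everything from column 12 onward is kept as raw optional-field strings.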
.PP
Each bit in the FLAG field is defined as:

.TS
center box;
cb | cb
l | l .
Flag	Description
_
0x0001	the read is paired in sequencing
0x0002	the read is mapped in a proper pair
0x0004	the query sequence itself is unmapped
0x0008	the mate is unmapped
0x0010	strand of the query (1 for reverse)
0x0020	strand of the mate
0x0040	the read is the first read in a pair
0x0080	the read is the second read in a pair
0x0100	the alignment is not primary
0x0200	the read fails platform/vendor quality checks
0x0400	the read is either a PCR or an optical duplicate
.TE

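.PP
Because FLAG is a bit field, a single value combines several of the rows above. The following Python sketch decodes a FLAG value into the descriptions of the bits that are set. The names FLAG_BITS and decode_flag are hypothetical and not part of samtools, and the wording for 0x0010 and 0x0020 spells out the set-bit case (reverse strand) as an interpretation of the table.

```python
from typing import List

# (bit mask, meaning when the bit is set), in the order of the table above.
FLAG_BITS = [
    (0x0001, "the read is paired in sequencing"),
    (0x0002, "the read is mapped in a proper pair"),
    (0x0004, "the query sequence itself is unmapped"),
    (0x0008, "the mate is unmapped"),
    (0x0010, "the query is on the reverse strand"),
    (0x0020, "the mate is on the reverse strand"),
    (0x0040, "the read is the first read in a pair"),
    (0x0080, "the read is the second read in a pair"),
    (0x0100, "the alignment is not primary"),
    (0x0200, "the read fails platform/vendor quality checks"),
    (0x0400, "the read is either a PCR or an optical duplicate"),
]


def decode_flag(flag: int) -> List[str]:
    """Return the description of every bit that is set in flag."""
    return [description for mask, description in FLAG_BITS if flag & mask]
```

For example, a FLAG of 99 is 0x0001 + 0x0002 + 0x0020 + 0x0040, so decode_flag(99) lists the paired, proper-pair, mate-reverse and first-read descriptions.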
.SH LIMITATIONS
.PP
.IP o 2
Unaligned words are used in bam_import.c, bam_endian.h, bam.c and bam_aux.c.
.IP o 2
CIGAR operation P is not properly handled at the moment.
.IP o 2
In merging, the input files are required to have the same number of
reference sequences. The requirement can be relaxed. In addition,
merging does not reconstruct the header dictionaries
automatically. End users have to provide the correct header. Picard is
better at merging.
.IP o 2
Samtools' rmdup does not work for single-end data and does not remove
duplicates across chromosomes. Picard is better.

.SH AUTHOR
.PP
Heng Li from the Sanger Institute wrote the C version of samtools. Bob
Handsaker from the Broad Institute implemented the BGZF library and Jue
Ruan from Beijing Genomics Institute wrote the RAZF library. Various
people in the 1000 Genomes Project contributed to the SAM format
specification.

.SH SEE ALSO
.PP
Samtools website: <http://samtools.sourceforge.net>