annotate bwa-0.6.2/stdaln.h @ 0:dd1186b11b3b draft

Uploaded BWA
author ashvark
date Fri, 18 Jul 2014 07:55:14 -0400
rev   line source
0
dd1186b11b3b Uploaded BWA
ashvark
parents:
diff changeset
1 /* The MIT License
2
3 Copyright (c) 2003-2006, 2008, by Heng Li <lh3lh3@gmail.com>
4
5 Permission is hereby granted, free of charge, to any person obtaining
6 a copy of this software and associated documentation files (the
7 "Software"), to deal in the Software without restriction, including
8 without limitation the rights to use, copy, modify, merge, publish,
9 distribute, sublicense, and/or sell copies of the Software, and to
10 permit persons to whom the Software is furnished to do so, subject to
11 the following conditions:
12
13 The above copyright notice and this permission notice shall be
14 included in all copies or substantial portions of the Software.
15
16 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18 MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19 NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
20 BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
21 ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
22 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
23 SOFTWARE.
24 */
25
26 /*
27 2009-07-23, 0.10.0
28
29 - Use 32-bit to store CIGAR
30
31 - Report suboptimal alignments
32
33 - Implemented half-fixed-half-open DP
34
35 2009-04-26, 0.9.10
36
37 - Allow setting a threshold for local alignment
38
39 2009-02-18, 0.9.9
40
41 - Fixed a bug when no residue matches
42
43 2008-08-04, 0.9.8
44
45 - Fixed the wrong declaration of aln_stdaln_aux()
46
47 - Avoid 0 coordinate for global alignment
48
49 2008-08-01, 0.9.7
50
51 - Change gap_end penalty to 5 in aln_param_bwa
52
53 - Add function to convert path_t to the CIGAR format
54
55 2008-08-01, 0.9.6
56
57 - The first gap now costs (gap_open+gap_ext), instead of
58 gap_open. Scoring systems are modified accordingly.
59
60 - Gap end is now handled correctly; previously it was not.
61
62 - Change license to MIT.
63
64 */
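The 0.9.6 entry in the changelog above says the first gap now costs (gap_open+gap_ext) rather than gap_open. A minimal sketch of that affine scheme follows, assuming a gap spanning k consecutive positions then costs gap_open + k * gap_ext. The helper name `gap_cost` is hypothetical and not part of stdaln.

```c
#include <assert.h>

/* Hypothetical helper, not part of stdaln: penalty of a gap spanning
 * `len` consecutive positions under the affine scheme described in the
 * 0.9.6 changelog entry, where the first gapped position already costs
 * gap_open + gap_ext. */
static int gap_cost(int gap_open, int gap_ext, int len)
{
    assert(len >= 0);
    if (len == 0) return 0;            /* no gap, no penalty */
    return gap_open + len * gap_ext;   /* open once, extend `len` times */
}
```

With gap_open = 5 and gap_ext = 2, a single-position gap costs 7 under this scheme, where the older convention would have charged 5.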
65
66 #ifndef LH3_STDALN_H_
67 #define LH3_STDALN_H_
68
69
70 #define STDALN_VERSION 0.11.0
71
72 #include <stdint.h>
73
74 #define FROM_M 0
75 #define FROM_I 1
76 #define FROM_D 2
77 #define FROM_S 3
78
79 #define ALN_TYPE_LOCAL 0
80 #define ALN_TYPE_GLOBAL 1
81 #define ALN_TYPE_EXTEND 2
82
83 /* A very small integer used as minus infinity: -(2^30 - 1), not INT_MIN. Assumes int is at least 32 bits wide. */
84 #define MINOR_INF -1073741823
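The value of MINOR_INF is -(2^30 - 1), roughly half of the most negative 32-bit int. One plausible reason, stated here as an assumption rather than something the header says, is that alignment code subtracts penalties from this "minus infinity" and needs headroom so the result does not overflow. The sketch below only demonstrates that arithmetic; `sub_penalty` is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>

#define MINOR_INF -1073741823   /* same value as in the header: -(2^30 - 1) */

/* Hypothetical helper: subtract a non-negative penalty from a score.
 * Starting from MINOR_INF the result stays representable in a 32-bit int
 * for any penalty up to 2^30, whereas INT_MIN - penalty would overflow. */
static int sub_penalty(int score, int penalty)
{
    assert(penalty >= 0);
    assert(score >= INT_MIN + penalty);   /* guard against signed overflow */
    return score - penalty;
}
```

Calling `sub_penalty(MINOR_INF, 1000)` yields a value that is still a valid int and still compares below MINOR_INF, so a "minus infinity" cell stays the worst candidate after a penalty is applied.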
85
86 typedef struct
87 {
88 int gap_open;
89 int gap_ext;
90 int gap_end;
91
92 int *matrix;
93 int row;
94 int band_width;
95 } AlnParam;
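AlnParam carries its substitution scores as a flat int array (`matrix`) plus a row count (`row`). The header does not state the layout, so the sketch below assumes row-major storage with row * row entries, which is the usual convention for such tables. `score_of` and the toy 4-symbol matrix are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Local copy of the struct from the header so the sketch is self-contained. */
typedef struct {
    int gap_open;
    int gap_ext;
    int gap_end;
    int *matrix;
    int row;
    int band_width;
} AlnParam;

/* Hypothetical toy matrix: +1 on the diagonal, -3 elsewhere, 4 symbols. */
static int toy_matrix[16] = {
     1, -3, -3, -3,
    -3,  1, -3, -3,
    -3, -3,  1, -3,
    -3, -3, -3,  1
};

/* Hypothetical helper: score of aligning symbol a against symbol b,
 * ASSUMING the matrix is stored row-major with ap->row columns per row. */
static int score_of(const AlnParam *ap, int a, int b)
{
    assert(a >= 0 && a < ap->row && b >= 0 && b < ap->row);
    return ap->matrix[a * ap->row + b];
}
```

Filling an AlnParam with `{ 5, 2, 2, toy_matrix, 4, 50 }` and calling `score_of(&ap, 2, 2)` returns the diagonal (match) score. Any pair of different symbols returns the mismatch score.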
96
97 typedef struct
98 {
99 int i, j;
100 unsigned char ctype;
101 } path_t;
102
103 typedef struct
104 {
105 path_t *path; /* for advanced users... :-) */
106 int path_len; /* for advanced users... :-) */
107 int start1, end1; /* start and end of the first sequence, coordinates are 1-based */
108 int start2, end2; /* start and end of the second sequence, coordinates are 1-based */
109 int score, subo; /* best score and suboptimal score */
110
111 char *out1, *out2; /* print them, and then you will know */
112 char *outm;
113
114 int n_cigar;
115 uint32_t *cigar32;
116 } AlnAln;
117
118 #ifdef __cplusplus
119 extern "C" {
120 #endif
121
122 AlnAln *aln_stdaln_aux(const char *seq1, const char *seq2, const AlnParam *ap,
123 int type, int do_align, int len1, int len2);
124 AlnAln *aln_stdaln(const char *seq1, const char *seq2, const AlnParam *ap, int type, int do_align);
125 void aln_free_AlnAln(AlnAln *aa);
126
127 int aln_global_core(unsigned char *seq1, int len1, unsigned char *seq2, int len2, const AlnParam *ap,
128 path_t *path, int *path_len);
129 int aln_local_core(unsigned char *seq1, int len1, unsigned char *seq2, int len2, const AlnParam *ap,
130 path_t *path, int *path_len, int _thres, int *_subo);
131 int aln_extend_core(unsigned char *seq1, int len1, unsigned char *seq2, int len2, const AlnParam *ap,
132 path_t *path, int *path_len, int G0, uint8_t *_mem);
133 uint16_t *aln_path2cigar(const path_t *path, int path_len, int *n_cigar);
134 uint32_t *aln_path2cigar32(const path_t *path, int path_len, int *n_cigar);
135
136 #ifdef __cplusplus
137 }
138 #endif
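The prototypes above include aln_path2cigar() and aln_path2cigar32(), which turn a path_t array into CIGAR form. The sketch below illustrates the general idea with a text CIGAR and is not the library's implementation. It assumes steps are read in array order and that ctype values 0 to 3 (FROM_M, FROM_I, FROM_D, FROM_S) map to the letters M, I, D, S. The function `path_to_cigar_text` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FROM_M 0
#define FROM_I 1
#define FROM_D 2
#define FROM_S 3

/* Local copy of path_t from the header so the sketch is self-contained. */
typedef struct {
    int i, j;
    unsigned char ctype;
} path_t;

/* Hypothetical helper, not the library's aln_path2cigar(): run-length
 * encode the ctype of each path step into a text CIGAR such as "2M1I3M".
 * ASSUMPTIONS: steps are read in array order, and ctype 0..3 map to the
 * letters M, I, D, S. Returns the number of characters written, or -1 if
 * the buffer is too small or a ctype is out of range. */
static int path_to_cigar_text(const path_t *path, int path_len,
                              char *buf, size_t buf_size)
{
    static const char ops[] = "MIDS";
    size_t used = 0;
    int k = 0;

    assert(path_len >= 0);
    if (buf_size == 0) return -1;
    buf[0] = '\0';
    while (k < path_len) {
        unsigned char op = path[k].ctype;
        int run = 1;
        int n;
        if (op > FROM_S) return -1;
        /* Count how many consecutive steps share this operation. */
        while (k + run < path_len && path[k + run].ctype == op) ++run;
        n = snprintf(buf + used, buf_size - used, "%d%c", run, ops[op]);
        if (n < 0 || (size_t)n >= buf_size - used) return -1;
        used += (size_t)n;
        k += run;
    }
    return (int)used;
}
```

A path of two match steps, one insertion step and three match steps comes out as the text "2M1I3M" under these assumptions.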
139
140 /********************
141 * global variables *
142 ********************/
143
144 extern AlnParam aln_param_bwa; /* = { 37, 9, 0, aln_sm_maq, 5, 50 }; */
145 extern AlnParam aln_param_blast; /* = { 5, 2, 2, aln_sm_blast, 5, 50 }; */
146 extern AlnParam aln_param_nt2nt; /* = { 10, 2, 2, aln_sm_nt, 16, 75 }; */
147 extern AlnParam aln_param_aa2aa; /* = { 20, 19, 19, aln_sm_read, 16, 75 }; */
148 extern AlnParam aln_param_rd2rd; /* = { 12, 2, 2, aln_sm_blosum62, 22, 50 }; */
149
150 /* common nucleotide score matrix for 16 bases */
151 extern int aln_sm_nt[], aln_sm_bwa[];
152
153 /* BLOSUM62 and BLOSUM45 */
154 extern int aln_sm_blosum62[], aln_sm_blosum45[];
155
156 /* common read score matrix for 16 bases. Note that read alignment is quite different from common nucleotide alignment */
157 extern int aln_sm_read[];
158
159 /* human-mouse score matrix for 4 bases */
160 extern int aln_sm_hs[];
161
162 #endif