annotate MISA/p3_misa_parameter.pl @ 0:3006582bfc76

Uploaded V1.0 MISA tools and helper scripts
author john-mccallum
date Wed, 14 Sep 2011 23:57:57 -0400
#!/usr/bin/perl -w
# Author: Thomas Thiel
# Program name: misa.pl

###_______________________________________________________________________________
###
###Program name: p3_misa_parameter.pl
###Author: Thomas Thiel
###Release date: 14/12/01 (version 1.0)
###
###_______________________________________________________________________________
###
## _______________________________________________________________________________
##
## DESCRIPTION: Tool for the identification and localization of
##              (I)  perfect microsatellites as well as
##              (II) compound microsatellites (two individual microsatellites,
##                   disrupted by a certain number of bases)
##
## SYNTAX:   p3_misa_parameter.pl <FASTA file> <output file> <definition> <interruptions>
##
##    <FASTA file>     Single file in FASTA format containing the sequence(s).
##    <output file>    File that receives the tab-separated table of SSRs.
##    <definition>     Pairs of numbers written as unit_size-min_repeats, where
##                     the first number defines the unit size and the second
##                     number the lower threshold of repeats for that specific
##                     unit. All pairs are passed as one (quoted) argument.
##    <interruptions>  A single number that defines the maximal number of bases
##                     between two adjacent microsatellites in order to specify
##                     the compound microsatellite type.
##
##    The original misa.pl read these search parameters from a file named
##    "misa.ini"; this version takes them from the command line instead.
##    Example values:
##      definition(unit_size,min_repeats):         1-10 2-6 3-5 4-5 5-5 6-5
##      interruptions(max_difference_for_2_SSRs):  100
##
## EXAMPLE:  p3_misa_parameter.pl seqs.fasta seqs.misa "1-10 2-6 3-5 4-5 5-5 6-5" 100
## Modified by Leshi Chen for primer design
## _______________________________________________________________________________
##

#§§§§§ DECLARATION §§§§§#

# Check for arguments. If none display syntax #

if (@ARGV == 0)
  {
  open (IN,"<$0");
  while (<IN>) {if (/^\#\# (.*)/) {$message .= "$1\n"}};
  close (IN);
  die $message;
  };

# Check if help is required #

if ($ARGV[0] =~ /-help/i)
  {
  open (IN,"<$0");
  while (<IN>) {if (/^\#\#\#(.*)/) {$message .= "$1\n"}};
  close (IN);
  die $message;
  };

# Open FASTA file #

open (IN,"<$ARGV[0]") || die ("\nError: FASTA file doesn't exist !\n\n");
#open (OUT,">$ARGV[0].misa"); updated by Leshi Chen for Galaxy integration
open (OUT,">$ARGV[1]") || die ("\nError: Output file can't be opened !\n\n");
print OUT "ID\tSSR nr.\tSSR type\tSSR\tsize\tstart\tend\n";

# Reading arguments. Updated by Leshi Chen to get the local path, otherwise an error is raised #
#use Cwd 'abs_path';
#use Cwd 'getcwd';
#print getcwd()&"misa.ini";
#print OUT abs_path($0);
#open (SPECS,"\/root\/galaxy_dist\/tools\/pfr_2010\/"."misa.ini") || die ("\nError: Specifications file doesn't exist ! \n\n misa.ini not found ! \n\n");
my $arg_def= $ARGV[2]||'';
my $arg_interuption= $ARGV[3]||0; # default to 0 so that "$amb + 1" below stays numeric
#my $tmb = '';
#my $_ = '';
my %typrep;
my $amb = 0;

%typrep = $arg_def =~/(\d+)-(\d+)/gi; # unit size => minimum number of repeats
#print "1:" , $arg_def , "\n";
#print "hh: ", %typrep , "\n";
#print $arg_def , "\n";
#print $arg_interuption ,"\n";
#print $arg_def =~/(\d+)/gi , "\n";
#%typrep = $arg_def =~/(\d+)/gi;
print %typrep , "\n";
$amb = $arg_interuption;
print $amb , "\n";
#while (<SPECS>)#
#   {#
#   %typrep = $1 =~ /(\d+)/gi if (/^def\S*\s+(.*)/i);#
#   if (/^int\S*\s+(\d+)/i) {$amb = $1}#
#   };#
my @typ = sort { $a <=> $b } keys %typrep;
print @typ . "\n"; # array in scalar context: prints the number of motif classes
#die (%typrep , "--" , @typ , "--" , $amb);
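# The parameter handling above can be tried in isolation. What follows is a
# minimal sketch under stated assumptions, not part of the script: the helper
# name parse_definition is hypothetical, and the sample string is the one
# given in the header example.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption, not in the original script): turn a
# definition string such as "1-10 2-6" into a hash that maps each unit
# size to its minimum number of repeats.
sub parse_definition {
    my ($definition) = @_;
    # A global match in list context returns every captured pair in order,
    # which assigns directly to a hash as key/value pairs.
    my %min_repeats = $definition =~ /(\d+)-(\d+)/g;
    return %min_repeats;
}

my %typrep = parse_definition("1-10 2-6 3-5 4-5 5-5 6-5");
my @typ    = sort { $a <=> $b } keys %typrep;

for my $unit_size (@typ) {
    print "unit size $unit_size: at least $typrep{$unit_size} repeats\n";
}
```

# A string without any valid pair yields an empty hash, so no motif class
# would be searched; callers may want to treat that case as an error.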
#§§§§§ CORE §§§§§#

$/ = ">";
my $max_repeats = 1; #count repeats
my $min_repeats = 1000; #count repeats
my (%count_motif,%count_class); #count
my ($number_sequences,$size_sequences,%ssr_containing_seqs); #stores number and size of all sequences examined
my $ssr_in_compound = 0;
my ($id,$seq);
while (<IN>)
  {
  next unless (($id,$seq) = /(.*?)\n(.*)/s);
  my ($nr,%start,@order,%end,%motif,%repeats); # store info of all SSRs from each sequence
  $seq =~ s/[\d\s>]//g; #remove digits, spaces, line breaks,...
  $id =~ s/^\s*//g; $id =~ s/\s*$//g;$id =~ s/\s/_/g; #replace whitespace with "_"
  $number_sequences++;
  $size_sequences += length $seq;
  for ($i=0; $i < scalar(@typ); $i++) #check each motif class
    {
    my $motiflen = $typ[$i];
    my $minreps = $typrep{$typ[$i]} - 1;
    if ($min_repeats > $typrep{$typ[$i]}) {$min_repeats = $typrep{$typ[$i]}}; #count repeats
    my $search = "(([acgt]{$motiflen})\\2{$minreps,})";
    while ( $seq =~ /$search/ig ) #scan whole sequence for that class
      {
      my $motif = uc $2;
      my $redundant; #reject false type motifs [e.g. (TT)6 or (ACAC)5]
      for ($j = $motiflen - 1; $j > 0; $j--)
        {
        my $redmotif = "([ACGT]{$j})\\1{".($motiflen/$j-1)."}";
        $redundant = 1 if ( $motif =~ /$redmotif/ )
        };
      next if $redundant;
      $motif{++$nr} = $motif;
      my $ssr = uc $1;
      $repeats{$nr} = length($ssr) / $motiflen;
      $end{$nr} = pos($seq);
      $start{$nr} = $end{$nr} - length($ssr) + 1;
      # count repeats
      # count_motifs is not required because the statistics have been removed - modified by Leshi
      #$count_motifs{$motif{$nr}}++; #counts occurrence of individual motifs
      $motif{$nr}->{$repeats{$nr}}++; #counts occurrence of specific SSR in its appearing repeat
      $count_class{$typ[$i]}++; #counts occurrence in each motif class
      if ($max_repeats < $repeats{$nr}) {$max_repeats = $repeats{$nr}};
      };
    };
  next if (!$nr); #no SSRs
  $ssr_containing_seqs{$nr}++;
  @order = sort { $start{$a} <=> $start{$b} } keys %start; #put SSRs in right order
  $i = 0;
  my $count_seq; #counts
  my ($start,$end,$ssrseq,$ssrtype,$size);
  while ($i < $nr)
    {
    my $space = $amb + 1;
    if (!$order[$i+1]) #last or only SSR
      {
      $count_seq++;
      my $motiflen = length ($motif{$order[$i]});
      $ssrtype = "p".$motiflen;
      $ssrseq = "($motif{$order[$i]})$repeats{$order[$i]}";
      $start = $start{$order[$i]}; $end = $end{$order[$i++]};
      next
      };
    if (($start{$order[$i+1]} - $end{$order[$i]}) > $space)
      {
      $count_seq++;
      my $motiflen = length ($motif{$order[$i]});
      $ssrtype = "p".$motiflen;
      $ssrseq = "($motif{$order[$i]})$repeats{$order[$i]}";
      $start = $start{$order[$i]}; $end = $end{$order[$i++]};
      next
      };
    my ($interssr);
    if (($start{$order[$i+1]} - $end{$order[$i]}) < 1)
      {
      $count_seq++; $ssr_in_compound++;
      $ssrtype = 'c*';
      $ssrseq = "($motif{$order[$i]})$repeats{$order[$i]}($motif{$order[$i+1]})$repeats{$order[$i+1]}*";
      $start = $start{$order[$i]}; $end = $end{$order[$i+1]}
      }
    else
      {
      $count_seq++; $ssr_in_compound++;
      $interssr = lc substr($seq,$end{$order[$i]},($start{$order[$i+1]} - $end{$order[$i]}) - 1);
      $ssrtype = 'c';
      $ssrseq = "($motif{$order[$i]})$repeats{$order[$i]}$interssr($motif{$order[$i+1]})$repeats{$order[$i+1]}";
      $start = $start{$order[$i]}; $end = $end{$order[$i+1]};
      #$space -= length $interssr
      };
    while ($order[++$i + 1] and (($start{$order[$i+1]} - $end{$order[$i]}) <= $space))
      {
      if (($start{$order[$i+1]} - $end{$order[$i]}) < 1)
        {
        $ssr_in_compound++;
        $ssrseq .= "($motif{$order[$i+1]})$repeats{$order[$i+1]}*";
        $ssrtype = 'c*';
        $end = $end{$order[$i+1]}
        }
      else
        {
        $ssr_in_compound++;
        $interssr = lc substr($seq,$end{$order[$i]},($start{$order[$i+1]} - $end{$order[$i]}) - 1);
        $ssrseq .= "$interssr($motif{$order[$i+1]})$repeats{$order[$i+1]}";
        $end = $end{$order[$i+1]};
        #$space -= length $interssr
        }
      };
    $i++;
    }
  continue
    {
    print OUT "$id\t$count_seq\t$ssrtype\t$ssrseq\t",($end - $start + 1),"\t$start\t$end\n"
    };
  };

close (OUT);
#open (OUT,">$ARGV[0].statistics"); updated by Leshi Chen for Galaxy integration
# the statistics part has been removed as we only need MISA for primer design
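
# The motif search in the CORE section can be illustrated on a toy sequence.
# This is a hedged sketch rather than the script's own code: find_ssrs is a
# hypothetical helper, and the redundancy test uses a single anchored regex
# (a swapped-in simplification) instead of the loop over shorter unit lengths.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption): find perfect repeats of one unit size.
# Returns a list of [motif, repeats, start, end] with 1-based coordinates.
sub find_ssrs {
    my ($seq, $motiflen, $min_repeats) = @_;
    # One copy of the unit followed by at least (min_repeats - 1) more copies
    my $pattern = sprintf '(([acgt]{%d})\2{%d,})', $motiflen, $min_repeats - 1;
    my $search  = qr/$pattern/i;
    my @hits;
    while ($seq =~ /$search/g) {
        my ($ssr, $motif) = (uc($1), uc($2));
        my $end = pos($seq);    # offset just past the match = 1-based end
        # Reject units that are repeats of a shorter unit, e.g. TT or ACAC
        next if $motif =~ /^(.+)\1+$/;
        push @hits, [ $motif, length($ssr) / $motiflen,
                      $end - length($ssr) + 1, $end ];
    }
    return @hits;
}

for my $hit (find_ssrs("ttACACACACACACgg", 2, 6)) {
    printf "(%s)%d from %d to %d\n", @$hit;    # prints: (AC)6 from 3 to 14
}
```

# Rejecting units such as ACAC keeps a dinucleotide repeat from being
# reported a second time under the tetranucleotide class.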
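
# The grouping of neighbouring SSRs into perfect (p), compound (c) and
# overlapping compound (c*) records can be sketched separately. This is a
# simplified illustration under stated assumptions: group_ssrs and the hash
# keys motif/start/end/type are hypothetical names, and the record only
# tracks type and coordinates, not the printed SSR string.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption): merge SSRs whose distance is within
# $max_gap bases. Each SSR is a hash ref with motif, start and end
# (1-based, inclusive coordinates).
sub group_ssrs {
    my ($max_gap, @ssrs) = @_;
    my @records;
    for my $ssr (sort { $a->{start} <=> $b->{start} } @ssrs) {
        my $prev = @records ? $records[-1] : undef;
        # start - end is 1 for directly adjacent SSRs and below 1 on overlap
        if (defined $prev && $ssr->{start} - $prev->{end} <= $max_gap + 1) {
            $prev->{type} = 'c*' if $ssr->{start} - $prev->{end} < 1;
            $prev->{type} = 'c'  if $prev->{type} =~ /^p/;
            $prev->{end}  = $ssr->{end};
        }
        else {
            # Too far from the previous record: start a new perfect SSR
            push @records, { type  => 'p' . length($ssr->{motif}),
                             start => $ssr->{start},
                             end   => $ssr->{end} };
        }
    }
    return @records;
}

my @records = group_ssrs(100,
    { motif => 'AC', start => 3,   end => 14  },
    { motif => 'AG', start => 20,  end => 31  },
    { motif => 'T',  start => 200, end => 211 },
);
printf "%s\t%d\t%d\n", @{$_}{qw(type start end)} for @records;
```

# With a maximal gap of 100 the first two SSRs form one compound record,
# while the third stays a perfect mononucleotide SSR.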