annotate getLongestORF.py @ 1:1c4b24e9bb16 draft

planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/blob/master/tools/longorf/ commit 5be33ea99532ab3abb000564af4c63c81c4ccd87
author mbernt
date Mon, 16 Jul 2018 11:01:52 -0400
parents ec898924d8c7
children
rev   line source
0
ec898924d8c7 planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/blob/master/tools/longorf/ commit 8e118a4d24047e2c62912b962e854f789d6ff559
mbernt
parents:
diff changeset
#!/usr/bin/env python
1
1c4b24e9bb16 planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/blob/master/tools/longorf/ commit 5be33ea99532ab3abb000564af4c63c81c4ccd87
mbernt
parents: 0
diff changeset
#example input:
#>STRG.1.1(-)_1 [10 - 69]
#GGNHHTLGGKKTFSYTHPPC
#>STRG.1.1(-)_2 [3 - 80]
#FLRGEPPHIGGKKDIFLHPPTLLKGR

#output1: FASTA file with the longest ORF of each transcript
#output2: table with seqID, transcript, orf_start, orf_end, length, strand, sense, longest (y/n) for all ORFs

import re
import sys
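
# Illustrative sketch (not part of the original tool): the header format shown in
# the example at the top of this script can be split into its parts with a single
# regular expression. The names parse_orf_header and _ORF_HEADER are assumptions
# made for this example; the script itself does not call them.
import re

_ORF_HEADER = re.compile(
    r'^>(?P<orf_id>(?P<seq_id>.+)_\d+) \[(?P<start>\d+) - (?P<end>\d+)\]')

def parse_orf_header(line):
    """Return (orf_id, seq_id, start, end) for a header line such as
    '>STRG.1.1(-)_1 [10 - 69]', or None if the line does not match."""
    match = _ORF_HEADER.match(line.strip())
    if match is None:
        return None
    return (match.group("orf_id"), match.group("seq_id"),
            int(match.group("start")), int(match.group("end")))

# Possible usage: parse_orf_header(">STRG.1.1(-)_2 [3 - 80]") yields the ORF id,
# the seqID without the trailing "_2", and the two coordinates as integers.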

def findlongestOrf(transcriptDict,old_seqID):
    #write the summary rows for the previous seqID
    prevTranscript = transcriptDict[old_seqID]
    i_max = 0
    transcript = old_seqID.split("(")[0]

    #find longest orf in transcript (on ties the last one wins)
    for i in range(0,len(prevTranscript)):
        if(prevTranscript[i][2] >= prevTranscript[i_max][2]):
            i_max = i

    for i in range(0,len(prevTranscript)):
        prevORFstart = prevTranscript[i][0]
        prevORFend = prevTranscript[i][1]
        prevORFlength = prevTranscript[i][2]
        header = prevTranscript[i][3]
        #raw string avoids invalid-escape warnings on newer Python versions
        strand = re.search(r'\(([+-]+)\)',header).group(1)

        output = str(header) + "\t" + str(transcript) + "\t" + str(prevORFstart) + "\t" + str(prevORFend) + "\t" + str(prevORFlength) + "\t" + str(strand)
        if (prevORFend - prevORFstart > 0):
            output+="\tnormal"
        else:
            output+="\treverse_sense"
        if(i == i_max):
            output += "\ty\n"
        else:
            output += "\tn\n"

        OUTPUT_ORF_SUMMARY.write(output)

    transcriptDict.pop(old_seqID, None)
    return None
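
# Illustrative sketch (an assumption, not the author's code): the index search in
# findlongestOrf keeps the last ORF among those that share the maximum length.
# The same choice can be expressed with max() by using (length, index) as the key.
# The name index_of_longest is hypothetical and is not used by the script.
def index_of_longest(orfs):
    """orfs is a non-empty list of [start, end, length, header] entries.
    Returns the index of the longest entry, preferring the last one on ties."""
    return max(range(len(orfs)), key=lambda i: (orfs[i][2], i))

# Note: unlike the loop in findlongestOrf, max() raises ValueError on an empty
# list, so this variant assumes every transcript has at least one ORF.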

#-----------------------------------------------------------------------------------------------------

INPUT = open(sys.argv[1],"r")
OUTPUT_FASTA = open(sys.argv[2],"w")
OUTPUT_ORF_SUMMARY = open(sys.argv[3],"w")

seqID = ""
old_seqID = ""
lengthDict = {}
seqDict = {}
headerDict = {}
transcriptDict = {}

skip = False

OUTPUT_ORF_SUMMARY.write("seqID\ttranscript\torf_start\torf_end\tlength\tstrand\tsense\tlongest\n")

for line in INPUT:
    line = line.strip()
    if(re.match(">",line)): #header
        header = line.split(">")[1].split(" ")[0]
        seqID = "_".join(line.split(">")[1].split("_")[:-1])
        ORFstart = int(re.search(r' \[(\d+) -', line).group(1))
        ORFend = int(re.search(r'- (\d+)\]',line).group(1))
        length = abs(ORFend - ORFstart)

        if(seqID not in transcriptDict and old_seqID != ""): #new transcript
            findlongestOrf(transcriptDict,old_seqID)

        if seqID not in transcriptDict:
            transcriptDict[seqID] = []

        transcriptDict[seqID].append([ORFstart,ORFend,length,header])

        if(seqID not in lengthDict and old_seqID != ""): #new transcript
            #write FASTA
            OUTPUT_FASTA.write(headerDict[old_seqID]+"\n"+seqDict[old_seqID]+"\n")
            #delete old dict entry
            headerDict.pop(old_seqID, None)
            seqDict.pop(old_seqID, None)
            lengthDict.pop(old_seqID, None)
        #if several longest sequences exist with the same length, the dictionary saves the last occurring.
        if(seqID not in lengthDict or length >= lengthDict[seqID]):
            headerDict[seqID] = line
            lengthDict[seqID] = length
            seqDict[seqID] = ""
            skip = False
        else:
            #the original "next;" here was a no-op expression, so it is dropped
            skip = True
        old_seqID = seqID
    elif(skip):
        #sequence line of an ORF shorter than the current longest: ignore it
        continue
    else:
        seqDict[seqID] += line

OUTPUT_FASTA.write(headerDict[old_seqID]+"\n"+seqDict[old_seqID])
findlongestOrf(transcriptDict,old_seqID)

INPUT.close()
OUTPUT_FASTA.close()
OUTPUT_ORF_SUMMARY.close()
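
# Illustrative sketch (hypothetical helper, not part of the original tool): the
# selection done by the main loop above can be mirrored in memory, which may help
# when testing the logic without files. The name longest_orf_records is an
# assumption. Unlike the streaming loop, this variant does not require the ORFs of
# one transcript to be adjacent in the input.
import re

def longest_orf_records(lines):
    """Return one (header_line, sequence) pair per transcript, keeping the last
    ORF among those with the greatest |end - start|."""
    best = {}       # seqID -> [length, header_line, sequence]
    current = None  # entry receiving sequence lines, or None while skipping
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            seq_id = "_".join(line[1:].split("_")[:-1])
            coords = re.search(r'\[(\d+) - (\d+)\]', line)
            length = abs(int(coords.group(2)) - int(coords.group(1)))
            if seq_id not in best or length >= best[seq_id][0]:
                best[seq_id] = [length, line, ""]
                current = best[seq_id]
            else:
                current = None
        elif current is not None:
            current[2] += line
    return [(entry[1], entry[2]) for entry in best.values()]

# Possible usage: pass the lines of the example input from the top of this script;
# only the record for STRG.1.1(-)_2 is returned, because its span (77) exceeds
# that of STRG.1.1(-)_1 (59).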