annotate getLongestORF.py @ 0:ec898924d8c7 draft

planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/blob/master/tools/longorf/ commit 8e118a4d24047e2c62912b962e854f789d6ff559
author mbernt
date Wed, 20 Jun 2018 11:02:06 -0400
parents
children 1c4b24e9bb16
#!/usr/bin/env python

"""
usage: getLongestORF.py input.fas output.fas output.tab

input.fas: an amino acid FASTA file of all open reading frames (ORFs) listed by transcript (output of the Galaxy tool "getorf")
output.fas: FASTA file with the longest ORF of each transcript
output.tab: table with seqID, start, end, length, orientation, and longest (y/n) for every ORF

example input:

>253936-254394(+)_1 [28 - 63]
LTNYCQMVHNIL
>253936-254394(+)_2 [18 - 77]
HKLIDKLLPNGAQYFVKSTQ
>253936-254394(+)_3 [32 - 148]
QTTAKWCTIFCKKYPVAPFHTMYLNYAVTWHHRSLLVAV
>253936-254394(+)_4 [117 - 152]
LGIIVPSLLLCN
>248351-252461(+)_1 [14 - 85]
VLARKYPRCLSPSKKSPCQLRQRS
>248351-252461(+)_2 [21 - 161]
PGNTHDASAHRKSLRVNSDKEVKCLFTKNAASEHPDHKRRRVSEHVP
>248351-252461(+)_3 [89 - 202]
VPLHQECCIGAPRPQTTACVRACAMTNTPRSSMTSKTG
>248351-252461(+)_4 [206 - 259]
SRTTSGRQSVLSEKLWRR
>248351-252461(+)_5 [263 - 313]
CLSPLWVPCCSRHSCHG
"""

import sys,re

def findlongestOrf(transcriptDict,old_seqID):
    #write the summary rows for the previous seqID
    prevTranscript = transcriptDict[old_seqID]
    i_max = 0
    #find longest orf in transcript (on ties the last one wins)
    for i in range(0,len(prevTranscript)):
        if(prevTranscript[i][2] >= prevTranscript[i_max][2]):
            i_max = i
    for i in range(0,len(prevTranscript)):
        prevStart = prevTranscript[i][0]
        prevEnd = prevTranscript[i][1]
        prevLength = prevTranscript[i][2]
        output = str(old_seqID) + "\t" + str(prevStart) + "\t" + str(prevEnd) + "\t" + str(prevLength)
        #use this ORF's own coordinates, not the global start/end of the header read last
        if (prevEnd - prevStart > 0):
            output+="\tForward"
        else:
            output+="\tReverse"
        if(i == i_max):
            output += "\ty\n"
        else:
            output += "\tn\n"
        OUTPUT_ORF_SUMMARY.write(output)
    transcriptDict.pop(old_seqID, None)
    return None

INPUT = open(sys.argv[1],"r")
OUTPUT_FASTA = open(sys.argv[2],"w")
OUTPUT_ORF_SUMMARY = open(sys.argv[3],"w")

seqID = ""
old_seqID = ""
lengthDict = {}
seqDict = {}
headerDict = {}
transcriptDict = {}
skip = False

OUTPUT_ORF_SUMMARY.write("seqID\tstart\tend\tlength\torientation\tlongest\n")

for line in INPUT:
    line = line.strip()
    # print line
    if(re.match(">",line)): #header
        #seqID is the header up to the last "_", i.e. without the ORF number
        seqID = "_".join(line.split(">")[1].split("_")[:-1])
        #seqID = line.split(">")[1].split("_")[0]
        start = int(re.search(r' \[(\d+) -', line).group(1))
        end = int(re.search(r'- (\d+)\]', line).group(1))
        length = abs(end - start)
        if(seqID not in transcriptDict and old_seqID != ""): #new transcript
            findlongestOrf(transcriptDict,old_seqID)
        if seqID not in transcriptDict:
            transcriptDict[seqID] = []
        transcriptDict[seqID].append([start,end,length])
        if(seqID not in lengthDict and old_seqID != ""): #new transcript
            #write FASTA
            OUTPUT_FASTA.write(headerDict[old_seqID]+"\n"+seqDict[old_seqID]+"\n")
            #delete old dict entry
            headerDict.pop(old_seqID, None)
            seqDict.pop(old_seqID, None)
            lengthDict.pop(old_seqID, None)
        #if several longest sequences exist with the same length, the dictionary saves the last occurring.
        if(seqID not in lengthDict or length >= lengthDict[seqID]):
            headerDict[seqID] = line
            lengthDict[seqID] = length
            seqDict[seqID] = ""
            skip = False
        else:
            #shorter ORF: ignore its sequence lines
            skip = True
        old_seqID = seqID
    elif(skip):
        continue
    else:
        seqDict[seqID] += line

#write the last transcript, which is not followed by another header
OUTPUT_FASTA.write(headerDict[old_seqID]+"\n"+seqDict[old_seqID])
findlongestOrf(transcriptDict,old_seqID)
INPUT.close()
OUTPUT_FASTA.close()
OUTPUT_ORF_SUMMARY.close()
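
The header handling in the script can be exercised without any files. The following is a minimal sketch, not part of the tool: the names HEADER_RE, parse_header, and longest_per_transcript are hypothetical, and it assumes headers of the form ">seqID_N [start - end]" as shown in the docstring example. It mirrors the script's rules as read from the source: length is abs(end - start), and on equal lengths the last ORF wins.

```python
import re

# Hypothetical pattern for illustration; it assumes the ">seqID_N [start - end]"
# layout from the docstring example and ignores anything after the closing bracket.
HEADER_RE = re.compile(r"^>(?P<orf_id>\S+) \[(?P<start>\d+) - (?P<end>\d+)\]")


def parse_header(line):
    """Return (seq_id, start, end, length) for one header line."""
    m = HEADER_RE.match(line)
    if m is None:
        raise ValueError("not a getorf-style header: %r" % line)
    # Drop the trailing "_N" ORF counter to get the transcript ID.
    seq_id = m.group("orf_id").rsplit("_", 1)[0]
    start, end = int(m.group("start")), int(m.group("end"))
    return seq_id, start, end, abs(end - start)


def longest_per_transcript(headers):
    """Map each transcript ID to the (start, end, length) of its longest ORF."""
    best = {}
    for line in headers:
        seq_id, start, end, length = parse_header(line)
        # ">=" keeps the last ORF when several share the maximum length.
        if seq_id not in best or length >= best[seq_id][2]:
            best[seq_id] = (start, end, length)
    return best


example = [
    ">253936-254394(+)_1 [28 - 63]",
    ">253936-254394(+)_2 [18 - 77]",
    ">253936-254394(+)_3 [32 - 148]",
    ">253936-254394(+)_4 [117 - 152]",
    ">248351-252461(+)_1 [14 - 85]",
    ">248351-252461(+)_2 [21 - 161]",
]
result = longest_per_transcript(example)
```

With the example headers above, the third ORF of the first transcript and the second ORF of the second transcript are the ones selected. A header whose start is larger than its end is treated as a reverse-strand ORF, in the same way the script derives its orientation column.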