annotate fastaregexfinder.py @ 0:269c627ae9f4 draft

planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/tree/master/tools/fasta_regex_finder commit 8e118a4d24047e2c62912b962e854f789d6ff559
author mbernt
date Wed, 20 Jun 2018 11:06:57 -0400
parents
children 9a811adb714f
#!/usr/bin/env python

import re
import sys
import string
import argparse
import operator

VERSION='0.1.1'

parser = argparse.ArgumentParser(description=r"""

DESCRIPTION

Search a fasta file for matches to a regex and return a bed file with the
coordinates of the match and the matched sequence itself.

Output bed file has columns:
1. Name of fasta sequence (e.g. chromosome)
2. Start of the match
3. End of the match
4. ID of the match
5. Length of the match
6. Strand
7. Matched sequence as it appears on the forward strand

For matches on the reverse strand, the start and end positions are reported on the
forward strand, along with the matched string as it reads on the forward strand (so
the G4 'GGGAGGGT' present on the reverse strand is reported as ACCCTCCC).

Note: Fasta sequences (chroms) are read into memory one at a time along with the
matches for that chromosome.
The order of the output is: chroms as they are found in the input fasta, matches
sorted within chroms by position.

EXAMPLE:
    ## Test data:
    echo '>mychr' > /tmp/mychr.fa
    echo 'ACTGnACTGnACTGnTGAC' >> /tmp/mychr.fa

    fastaRegexFinder.py -f /tmp/mychr.fa -r 'ACTG'
    mychr 0 4 mychr_0_4_for 4 + ACTG
    mychr 5 9 mychr_5_9_for 4 + ACTG
    mychr 10 14 mychr_10_14_for 4 + ACTG

    fastaRegexFinder.py -f /tmp/mychr.fa -r 'ACTG' --maxstr 3
    mychr 0 4 mychr_0_4_for 4 + ACT[3,4]
    mychr 5 9 mychr_5_9_for 4 + ACT[3,4]
    mychr 10 14 mychr_10_14_for 4 + ACT[3,4]

    less /tmp/mychr.fa | fastaRegexFinder.py -f - -r 'A\w\wGn'
    mychr 0 5 mychr_0_5_for 5 + ACTGn
    mychr 5 10 mychr_5_10_for 5 + ACTGn
    mychr 10 15 mychr_10_15_for 5 + ACTGn

DOWNLOAD
fastaRegexFinder.py is hosted at https://github.com/dariober/bioinformatics-cafe/tree/master/fastaRegexFinder

""", formatter_class= argparse.RawTextHelpFormatter)

parser.add_argument('--fasta', '-f',
    type= str,
    help='''Input fasta file to search. Use '-' to read the file from stdin.

''',
    required= True)

# Raw strings keep the '\w' in the regex from being read as an invalid escape sequence.
parser.add_argument('--regex', '-r',
    type= str,
    help=r'''Regex to be searched in the fasta input.
Matches to the reverse complement will have - strand.
The default regex is '([gG]{3,}\w{1,7}){3,}[gG]{3,}' which searches
for G-quadruplexes.
''',
    default= r'([gG]{3,}\w{1,7}){3,}[gG]{3,}')

parser.add_argument('--matchcase', '-m',
    action= 'store_true',
    help='''Match case while searching for matches. Default is
to ignore case (i.e. 'ACTG' will match 'actg').
''')

parser.add_argument('--noreverse',
    action= 'store_true',
    help='''Do not search the reverse complement of the input fasta.
Use this flag to search protein sequences.
''')

parser.add_argument('--maxstr',
    type= int,
    required= False,
    default= 10000,
    help='''Maximum length of the match to report in the 7th column of the output.
Default is to report up to 10000nt.
Truncated matches are reported as <ACTG...ACTG>[<maxstr>,<tot length>]
''')

parser.add_argument('--seqnames', '-s',
    type= str,
    nargs= '+',
    default= [None],
    required= False,
    help='''List of fasta sequences in --fasta to
search. E.g. use --seqnames chr1 chr2 chrM to search only these chromosomes.
Default is to search all the sequences in input.
''')

parser.add_argument('--quiet', '-q',
    action= 'store_true',
    help='''Do not print progress report (i.e. sequence names as they are scanned).
''')

parser.add_argument('--version', '-v', action='version', version='%(prog)s ' + VERSION)

args = parser.parse_args()

" --------------------------[ Check and parse arguments ]---------------------- "

if args.matchcase:
    flag= 0
else:
    flag= re.IGNORECASE

" ------------------------------[ Functions ]--------------------------------- "

def sort_table(table, cols):
    """ Code to sort list of lists
    see http://www.saltycrane.com/blog/2007/12/how-to-sort-table-by-columns-in-python/

    sort a table by multiple columns
        table: a list of lists (or tuple of tuples) where each inner list
               represents a row
        cols:  a list (or tuple) specifying the column numbers to sort by
               e.g. (1,0) would sort by column 1, then by column 0
    """
    for col in reversed(cols):
        table = sorted(table, key=operator.itemgetter(col))
269c627ae9f4 planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/tree/master/tools/fasta_regex_finder commit 8e118a4d24047e2c62912b962e854f789d6ff559
mbernt
parents:
diff changeset
140 return(table)
269c627ae9f4 planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/tree/master/tools/fasta_regex_finder commit 8e118a4d24047e2c62912b962e854f789d6ff559
mbernt
parents:
diff changeset
141
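# The multi-pass approach in sort_table relies on sorted() being stable. The
# sketch below (sort_by_columns and the sample rows are illustrative names and
# data, not part of the script) checks that repeated stable sorts give the same
# order as a single sort on a tuple key built with operator.itemgetter.

```python
import operator


def sort_by_columns(table, cols):
    """Same idea as sort_table: stable sorts, least significant column first."""
    for col in reversed(cols):
        table = sorted(table, key=operator.itemgetter(col))
    return table


# Hypothetical rows shaped like the script's output: name, start, end, id.
rows = [
    ["chr1", 30, 40, "b"],
    ["chr1", 10, 20, "a"],
    ["chr1", 10, 15, "c"],
]

by_passes = sort_by_columns(rows, (1, 2, 3))
# One pass with a tuple key; equivalent when all columns sort ascending.
by_tuple = sorted(rows, key=operator.itemgetter(1, 2, 3))
```

# The single-pass form may be simpler when every column sorts in the same
# direction; the multi-pass form extends more easily to mixed directions.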
def trimMatch(x, n):
    """Trim the string x to be at most length n. Trimmed matches are reported
    with the syntax ACTG[a,b], where ACTG is the beginning of x, a is the length
    of the trimmed string (e.g. 4 here) and b is the full length of the match.
    If n is None, x is returned untrimmed.
    EXAMPLE:
        trimMatch('ACTGNNNN', 4)
        >>>'ACTG[4,8]'
        trimMatch('ACTGNNNN', 8)
        >>>'ACTGNNNN'
    """
    # Test for None first: comparing an int with None raises TypeError on Python 3.
    if n is not None and len(x) > n:
        m= x[0:n] + '[' + str(n) + ',' + str(len(x)) + ']'
    else:
        m= x
    return m

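# A minimal sketch of the trimming rule described in the trimMatch docstring,
# assuming Python 3. trim_match_sketch is a hypothetical stand-in, not the
# script's function; it tests for None before comparing lengths so that an
# unset limit simply returns the match untrimmed.

```python
def trim_match_sketch(x, n):
    """Return x cut to n characters plus '[n,len(x)]', or x itself if short enough."""
    if n is not None and len(x) > n:
        return "%s[%d,%d]" % (x[:n], n, len(x))
    return x


trimmed = trim_match_sketch("ACTGNNNN", 4)       # longer than the limit
untouched = trim_match_sketch("ACTGNNNN", 8)     # exactly at the limit
no_limit = trim_match_sketch("ACTGNNNN", None)   # no limit given
```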
def revcomp(x):
    """Reverse complement string x. Ambiguity codes are handled and case conserved.

    Test
    x= 'ACGTRYSWKMBDHVNacgtryswkmbdhvn'
    revcomp(x)
    """
    # IUPAC codes: S (C or G) and W (A or T) are their own complements.
    compdict= {'A':'T',
               'C':'G',
               'G':'C',
               'T':'A',
               'R':'Y',
               'Y':'R',
               'S':'S',
               'W':'W',
               'K':'M',
               'M':'K',
               'B':'V',
               'D':'H',
               'H':'D',
               'V':'B',
               'N':'N',
               'a':'t',
               'c':'g',
               'g':'c',
               't':'a',
               'r':'y',
               'y':'r',
               's':'s',
               'w':'w',
               'k':'m',
               'm':'k',
               'b':'v',
               'd':'h',
               'h':'d',
               'v':'b',
               'n':'n'}
    xrc= []
    # Characters missing from compdict (e.g. gaps) raise KeyError.
    for n in x:
        xrc.append(compdict[n])
    xrc= ''.join(xrc)[::-1]
    return xrc
# -----------------------------------------------------------------------------

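# An alternative sketch of reverse complementing with a translation table,
# assuming Python 3 (str.maketrans). revcomp_translate and the two constant
# names are hypothetical, not part of the script. Unlike a dict lookup,
# str.translate leaves characters absent from the table unchanged instead of
# raising KeyError, which may or may not be what is wanted.

```python
# IUPAC nucleotide codes and their complements, position by position.
IUPAC_CODES = "ACGTRYSWKMBDHVN"
IUPAC_COMPL = "TGCAYRSWMKVHDBN"

# Cover both cases so the case of each base is preserved.
_COMPLEMENT_TABLE = str.maketrans(
    IUPAC_CODES + IUPAC_CODES.lower(),
    IUPAC_COMPL + IUPAC_COMPL.lower(),
)


def revcomp_translate(x):
    """Complement every base, then reverse the string."""
    return x.translate(_COMPLEMENT_TABLE)[::-1]


all_codes = IUPAC_CODES + IUPAC_CODES.lower()
round_trip = revcomp_translate(revcomp_translate(all_codes))
```

# Applying the function twice should give back the input, which is a cheap
# sanity check for any complement table.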
psq_re_f= re.compile(args.regex, flags= flag)
## psq_re_r= re.compile(regexrev)

if args.fasta != '-':
    ref_seq_fh= open(args.fasta)
else:
    ref_seq_fh= sys.stdin

ref_seq=[]
# First line is the header of the first sequence; strip the leading '>'.
line= (ref_seq_fh.readline()).strip()
chr= re.sub('^>', '', line)
line= (ref_seq_fh.readline()).strip()
gquad_list= []
while True:
    if not args.quiet:
        sys.stderr.write('Processing %s\n' %(chr))
    # Collect sequence lines up to the next header. An empty string means
    # end of file, but a blank line inside the file also ends the record.
    while line.startswith('>') is False:
        ref_seq.append(line)
        line= (ref_seq_fh.readline()).strip()
        if line == '':
            break
    ref_seq= ''.join(ref_seq)
    if args.seqnames == [None] or chr in args.seqnames:
        # Forward strand: report match coordinates as they are.
        for m in re.finditer(psq_re_f, ref_seq):
            matchstr= trimMatch(m.group(0), args.maxstr)
            quad_id= str(chr) + '_' + str(m.start()) + '_' + str(m.end()) + '_for'
            gquad_list.append([chr, m.start(), m.end(), quad_id, len(m.group(0)), '+', matchstr])
        if args.noreverse is False:
            # Reverse strand: search the reverse complement, then convert the
            # match coordinates back to forward-strand positions.
            ref_seq= revcomp(ref_seq)
            seqlen= len(ref_seq)
            for m in re.finditer(psq_re_f, ref_seq):
                matchstr= trimMatch(revcomp(m.group(0)), args.maxstr)
                mstart= seqlen - m.end()
                mend= seqlen - m.start()
                quad_id= str(chr) + '_' + str(mstart) + '_' + str(mend) + '_rev'
                gquad_list.append([chr, mstart, mend, quad_id, len(m.group(0)), '-', matchstr])
        # Sort the matches of this sequence by start, end and ID, then print
        # them as tab-separated lines.
        gquad_sorted= sort_table(gquad_list, (1,2,3))
        gquad_list= []
        for xline in gquad_sorted:
            xline= '\t'.join([str(x) for x in xline])
            print(xline)
    # 'line' now holds the next header (or '' at end of file).
    chr= re.sub('^>', '', line)
    ref_seq= []
    line= (ref_seq_fh.readline()).strip()
    if line == '':
        break

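# The loop above treats an empty line as end of input. As a hedged alternative,
# the sketch below reads FASTA records with a generator that skips blank lines.
# iter_fasta and the in-memory sample are hypothetical and not part of the
# script; the sketch assumes plain-text FASTA with '>' header lines.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA handle."""
    name, chunks = None, []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines instead of stopping
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# Sample input with a blank line inside the first record.
sample = io.StringIO(">seq1\nACGT\n\nAC\n>seq2\nGG\n")
records = list(iter_fasta(sample))
```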
#gquad_sorted= sort_table(gquad_list, (0,1,2,3))
#
#for line in gquad_sorted:
#    line= '\t'.join([str(x) for x in line])
#    print(line)
sys.exit()
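
# A worked sketch of the reverse-strand coordinate conversion used in the main
# loop (mstart = seqlen - m.end(), mend = seqlen - m.start()). The helper
# revcomp_simple, the toy sequence and the pattern are illustrative assumptions,
# not taken from the script.

```python
import re


def revcomp_simple(seq):
    """Reverse complement for unambiguous upper-case bases plus N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


seq = "TTCCCAAA"                 # forward strand
pattern = re.compile("GGG")      # only matches on the reverse strand here
rc = revcomp_simple(seq)

hits = []
for m in pattern.finditer(rc):
    # Convert positions on the reverse complement to forward-strand,
    # 0-based, end-exclusive coordinates.
    mstart = len(seq) - m.end()
    mend = len(seq) - m.start()
    hits.append((mstart, mend, seq[mstart:mend]))
```

# The forward-strand slice at the converted coordinates is the reverse
# complement of the matched text, which is what the script reports for
# '-' strand hits.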