annotate mothur/lib/galaxy/datatypes/metagenomics.py @ 34:1be61ceb20d7

Updated tool_dependencies.xml to build mothur package on Linux (may break other OSes).
author pjbriggs
date Mon, 22 Sep 2014 11:19:09 -0400
parents ec8df51e841a
children 95d75b35e4d2
rev   line source
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1 """
2 metagenomics datatypes
3 James E Johnson - University of Minnesota
4 for Mothur
5 """
6
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
7 import logging, os, os.path, sys, time, tempfile, shutil, string, glob, re
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
8 import galaxy.model
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
9 from galaxy.datatypes.sniff import *
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
10 from galaxy.datatypes.metadata import MetadataElement
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
11 from galaxy.datatypes.data import Text
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
12 from galaxy.datatypes.tabular import Tabular
13 from galaxy.datatypes.sequence import Fasta
14 from galaxy import util
15 from galaxy.datatypes.images import Html
26
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
16 import pkg_resources
17 pkg_resources.require("simplejson")
18 import simplejson
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
19
20 log = logging.getLogger(__name__)
21
22 ## Mothur Classes
23
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. Change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This greatly decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
24 class Otu( Text ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
25 file_ext = 'otu'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. Change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This greatly decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
26 MetadataElement( name="columns", default=0, desc="Number of columns", readonly=True, visible=True, no_value=0 )
27 MetadataElement( name="labels", default=[], desc="Label Names", readonly=True, visible=True, no_value=[] )
28 def __init__(self, **kwd):
29 Text.__init__( self, **kwd )
30 def set_meta( self, dataset, overwrite = True, **kwd ):
31 if dataset.has_data():
32 label_names = set()
33 ncols = 0
34 data_lines = 0
35 comment_lines = 0
36 try:
37 fh = open( dataset.file_name )
38 for line in fh:
39 fields = line.strip().split('\t')
40 if len(fields) >= 2:
41 data_lines += 1
42 ncols = max(ncols,len(fields))
43 label_names.add(fields[0])
44 else:
45 comment_lines += 1
46 # Set the discovered metadata values for the dataset
47 dataset.metadata.data_lines = data_lines
48 dataset.metadata.columns = ncols
49 dataset.metadata.labels = []
50 dataset.metadata.labels += label_names
51 dataset.metadata.labels.sort()
52 finally:
53 fh.close()
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
54
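The scan performed by `Otu.set_meta` above can be tried outside Galaxy. What follows is a minimal standalone sketch under stated assumptions, not the Galaxy implementation: `scan_otu_metadata` is a hypothetical helper name, it reads from any iterable of lines instead of `dataset.file_name`, and it returns a plain dict rather than writing to `dataset.metadata`.

```python
def scan_otu_metadata(lines):
    """Collect the values that Otu.set_meta stores, from an iterable of lines.

    Hypothetical helper for illustration only; it mirrors the loop in
    set_meta but does not touch any Galaxy objects.
    """
    label_names = set()
    ncols = 0
    data_lines = 0
    comment_lines = 0
    for line in lines:
        fields = line.strip().split('\t')
        if len(fields) >= 2:
            # A data line: count it, track the widest row, remember its label.
            data_lines += 1
            ncols = max(ncols, len(fields))
            label_names.add(fields[0])
        else:
            # Anything with fewer than two tab-separated fields is counted
            # as a comment line, including blank lines.
            comment_lines += 1
    return {
        'data_lines': data_lines,
        'comment_lines': comment_lines,
        'columns': ncols,
        'labels': sorted(label_names),
    }


# Made-up sample content, not taken from a real mothur run.
example = ["unique\t2\tseqA\tseqB\n", "0.03\t1\tseqA,seqB\n", "\n"]
meta = scan_otu_metadata(example)
```

Because labels are collected in a set and then sorted, a label that appears on several lines is reported once.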
55 def sniff( self, filename ):
56 """
57 Determines whether the file is in OTU (operational taxonomic unit) format
58 """
59 try:
60 fh = open( filename )
61 count = 0
62 while True:
63 line = fh.readline()
64 line = line.strip()
65 if not line:
66 break #EOF
67 if line:
68 if line[0] != '@':
69 linePieces = line.split('\t')
70 if len(linePieces) < 2:
71 return False
72 try:
73 check = int(linePieces[1])
74 if check + 2 != len(linePieces):
75 return False
76 except ValueError:
77 return False
78 count += 1
79 if count == 5:
80 return True
81 fh.close()
82 if count < 5 and count > 0:
83 return True
84 except:
85 pass
86 finally:
87 fh.close()
88 return False
89
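The per-line rule applied inside `Otu.sniff` can be isolated for experimentation: lines starting with `@` are skipped, a line needs at least two tab-separated fields, and the integer in the second field plus 2 must equal the number of fields. The sketch below is an illustration under those assumptions; `looks_like_otu_line` is a hypothetical name and is not part of Galaxy.

```python
def looks_like_otu_line(line):
    """Apply the Otu.sniff field-count rule to a single line.

    Returns True or False for a checkable line, and None for a blank or
    '@'-prefixed line. Note that the original sniff loop stops reading at
    the first blank line, treating it as end of file; this helper only
    reports that there is nothing to check.
    """
    line = line.strip()
    if not line or line.startswith('@'):
        return None
    pieces = line.split('\t')
    if len(pieces) < 2:
        return False
    try:
        # Column 2 holds the number of value columns that follow it.
        return int(pieces[1]) + 2 == len(pieces)
    except ValueError:
        return False
```

A line such as `0.03<TAB>2<TAB>a,b<TAB>c` passes because it declares two value columns and carries four fields in total.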
90 class OtuList( Otu ):
91 file_ext = 'list'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. Change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This greatly decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
92 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
93 """
94 # http://www.mothur.org/wiki/List_file
95 The first column is a label that represents the distance at which sequences were assigned to OTUs.
96 The number in the second column is the number of OTUs that have been formed.
97 Each subsequent column contains the names of the sequences within one OTU, separated by commas.
98 distance_label otu_count OTU1 OTU2 OTUn
99 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. Change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This greatly decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
100 Otu.__init__( self, **kwd )
101 def init_meta( self, dataset, copy_from=None ):
102 Otu.init_meta( self, dataset, copy_from=copy_from )
103 def set_meta( self, dataset, overwrite = True, **kwd ):
104 Otu.set_meta( self, dataset, overwrite = overwrite, **kwd )
105 """
106 # too many columns to be stored in metadata
107 if dataset is not None and dataset.metadata.columns > 2:
108 for i in range(2,dataset.metadata.columns):
109 dataset.metadata.column_types[i] = 'str'
110 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
111
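The list-file layout described in the `OtuList` docstring (`distance_label otu_count OTU1 OTU2 OTUn`, tab-separated, with comma-separated sequence names inside each OTU column) can be illustrated with a small parser. This is a sketch under those stated assumptions: `parse_list_line` is a hypothetical helper, not a Galaxy or mothur function, and the sample line is made up.

```python
def parse_list_line(line):
    """Split one list-file line into its label and its OTUs.

    Returns (label, otus), where otus is a list of lists of sequence names.
    Raises ValueError when the declared OTU count does not match the number
    of OTU columns, or when the count is not an integer.
    """
    fields = line.rstrip('\n').split('\t')
    if len(fields) < 2:
        raise ValueError("expected at least a label and an OTU count")
    label = fields[0]
    otu_count = int(fields[1])  # raises ValueError on a non-integer count
    # Every remaining column is one OTU; its members are comma-separated.
    otus = [column.split(',') for column in fields[2:]]
    if len(otus) != otu_count:
        raise ValueError(
            "declared %d OTUs but found %d" % (otu_count, len(otus))
        )
    return label, otus


label, otus = parse_list_line("0.03\t2\tseqA,seqB\tseqC\n")
```

The count check is the same relationship that `Otu.sniff` tests with `check + 2 != len(linePieces)`, expressed here on the OTU columns alone.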
112 class Sabund( Otu ):
113 file_ext = 'sabund'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. Change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This greatly decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
114 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
115 """
116 # http://www.mothur.org/wiki/Sabund_file
117 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. Change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This greatly decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
118 Otu.__init__( self, **kwd )
119 def init_meta( self, dataset, copy_from=None ):
120 Otu.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
121 def sniff( self, filename ):
122 """
123 Determines whether the file is in sabund format:
124 label<TAB>count[<TAB>value(1..n)]
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
125
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
126 """
127 try:
128 fh = open( filename )
129 count = 0
130 while True:
131 line = fh.readline()
132 line = line.strip()
133 if not line:
134 break #EOF
135 if line:
136 if line[0] != '@':
137 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
138 if len(linePieces) < 2:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
139 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
140 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
141 check = int(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
142 if check + 2 != len(linePieces):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
143 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
144 for i in range( 2, len(linePieces)):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
145 ival = int(linePieces[i])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
146 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
147 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
148 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
149 if count >= 5:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
150 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
151 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
152 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
153 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
154 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
155 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
156 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
157 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
158 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
159
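The per-line rule that this sniffer applies (the count in the second column must equal the number of integer columns that follow) can be sketched as a standalone function. This is an illustration only and not part of metagenomics.py; the helper name `is_sabund_line` and the sample rows are hypothetical.

```python
def is_sabund_line(line):
    """Return True if line matches label<TAB>count[<TAB>value(1..n)].

    Hypothetical helper: the integer in column 2 must equal the number
    of integer value columns that follow it.
    """
    pieces = line.rstrip('\r\n').split('\t')
    if len(pieces) < 2:
        return False
    try:
        count = int(pieces[1])
        values = [int(p) for p in pieces[2:]]
    except ValueError:
        # a non-integer count or value disqualifies the line
        return False
    return count == len(values)
```

For instance, a row with count 2 followed by two integers passes, while the same row with count 3 does not.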
class Rabund( Sabund ):
    file_ext = 'rabund'
    def __init__(self, **kwd):
        """
        # http://www.mothur.org/wiki/Rabund_file
        """
        Sabund.__init__( self, **kwd )
    def init_meta( self, dataset, copy_from=None ):
        Sabund.init_meta( self, dataset, copy_from=copy_from )

class GroupAbund( Otu ):
    file_ext = 'grpabund'
    MetadataElement( name="groups", default=[], desc="Group Names", readonly=True, visible=True, no_value=[] )
    def __init__(self, **kwd):
        Otu.__init__( self, **kwd )
        # self.column_names[0] = ['label']
        # self.column_names[1] = ['group']
        # self.column_names[2] = ['count']
    """
    def init_meta( self, dataset, copy_from=None ):
        Otu.init_meta( self, dataset, copy_from=copy_from )
    """
    def init_meta( self, dataset, copy_from=None ):
        Otu.init_meta( self, dataset, copy_from=copy_from )
    def set_meta( self, dataset, overwrite = True, skip=1, max_data_lines = 100000, **kwd ):
        # See if file starts with header line
        if dataset.has_data():
            label_names = set()
            group_names = set()
            data_lines = 0
            comment_lines = 0
            ncols = 0
            # Open before the try block so that finally never sees an unbound fh
            fh = open( dataset.file_name )
            try:
                line = fh.readline()
                fields = line.strip().split('\t')
                ncols = max(ncols,len(fields))
                if fields[0] == 'label' and fields[1] == 'Group':
                    skip=1
                    comment_lines += 1
                else:
                    skip=0
                    data_lines += 1
                    label_names.add(fields[0])
                    group_names.add(fields[1])
                for line in fh:
                    data_lines += 1
                    fields = line.strip().split('\t')
                    ncols = max(ncols,len(fields))
                    label_names.add(fields[0])
                    group_names.add(fields[1])
                # Set the discovered metadata values for the dataset
                dataset.metadata.data_lines = data_lines
                dataset.metadata.columns = ncols
                dataset.metadata.labels = []
                dataset.metadata.labels += label_names
                dataset.metadata.labels.sort()
                dataset.metadata.groups = []
                dataset.metadata.groups += group_names
                dataset.metadata.groups.sort()
                dataset.metadata.skip = skip
            finally:
                fh.close()

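The label and group collection performed by set_meta can be sketched without a Galaxy dataset object. The following is a minimal illustration under stated assumptions: the function name `collect_labels_and_groups` is hypothetical, it takes any iterable of lines, and, unlike the method above, it skips lines with fewer than two fields instead of raising IndexError.

```python
def collect_labels_and_groups(lines):
    """Gather sorted label and group names from shared-style rows.

    Hypothetical helper: column 1 is the label, column 2 the group.
    A first row starting with 'label', 'Group' is treated as a header.
    """
    labels, groups = set(), set()
    ncols = 0
    data_lines = 0
    for lineno, line in enumerate(lines):
        fields = line.strip().split('\t')
        if len(fields) < 2:
            continue  # assumption: ignore blank or short lines
        ncols = max(ncols, len(fields))
        if lineno == 0 and fields[0] == 'label' and fields[1] == 'Group':
            continue  # header row carries no label or group data
        data_lines += 1
        labels.add(fields[0])
        groups.add(fields[1])
    return {
        'labels': sorted(labels),
        'groups': sorted(groups),
        'columns': ncols,
        'data_lines': data_lines,
    }
```

Collecting into sets first keeps memory proportional to the number of distinct labels and groups rather than to the number of rows.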
    def sniff( self, filename, vals_are_int=False):
        """
        Determines whether the file is an OTU (operational taxonomic unit) Shared format
        label<TAB>group<TAB>count[<TAB>value(1..n)]
        The first line is column headings as of Mothur v 1.20
        """
        # Initialise fh so the finally clause is safe even if open() fails
        fh = None
        try:
            fh = open( filename )
            count = 0
            while True:
                line = fh.readline()
                line = line.strip()
                if not line:
                    break # EOF or blank line
                if line:
                    if line[0] != '@':
                        linePieces = line.split('\t')
                        if len(linePieces) < 3:
                            return False
                        # skip validation only for a leading header row
                        if count > 0 or linePieces[0] != 'label':
                            try:
                                check = int(linePieces[2])
                                if check + 3 != len(linePieces):
                                    return False
                                for i in range( 3, len(linePieces)):
                                    if vals_are_int:
                                        ival = int(linePieces[i])
                                    else:
                                        fval = float(linePieces[i])
                            except ValueError:
                                return False
                        count += 1
                        if count >= 5:
                            return True
            fh.close()
            if count < 5 and count > 0:
                return True
        except:
            pass
        finally:
            if fh is not None:
                fh.close()
        return False

7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
267 class SharedRabund( GroupAbund ):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
268 file_ext = 'shared'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
269 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
270 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
271 # http://www.mothur.org/wiki/Shared_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
272 A shared file is analogous to an rabund file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
273 The data in a shared file represent the number of times that an OTU is observed in multiple samples.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
274 The structure of a shared file is analogous to an rabund file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
275 The first column contains the label for the comparison - this will be the value for the first column of each line from the original list file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
276 The second column contains the group name that designates where the data is coming from for that row.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
277 The third column is the number of OTUs that were found between each of the groups and is the number of columns that follow.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
278 Finally, the remaining columns indicate the number of sequences that belonged to each OTU from that group.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
279 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
280 GroupAbund.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
281 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
282 GroupAbund.init_meta( self, dataset, copy_from=copy_from )
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
283 def sniff( self, filename ):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
284 """
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
285 Determines whether the file is an OTU (operational taxonomic unit) Shared format
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
286 label<TAB>group<TAB>count[<TAB>value(1..n)]
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
287 The first line is column headings as of Mothur v 1.20
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
288 """
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
289 # return GroupAbund.sniff(self,filename,True)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
290 isme = GroupAbund.sniff(self,filename,True)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
291 return isme
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
292
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
293
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
294 class RelAbund( GroupAbund ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
295 file_ext = 'relabund'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
296 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
297 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
298 # http://www.mothur.org/wiki/Relabund_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
299 The structure of a relabund file is analogous to a shared file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
300 The first column contains the label for the comparison - this will be the value for the first column of each line from the original list file (e.g. final.an.list).
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
301 The second column contains the group name that designates where the data is coming from for that row. Next is the number of OTUs that were found between each of the groups and is the number of columns that follow.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
302 Finally, the remaining columns indicate the relative abundance of each OTU from that group.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
303 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
304 GroupAbund.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
305 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
306 GroupAbund.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
307 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
308 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
309 Determines whether the file is an OTU (operational taxonomic unit) Relative Abundance format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
310 label<TAB>group<TAB>count[<TAB>value(1..n)]
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
311 The first line is column headings as of Mothur v 1.20
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
312 """
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
313 # return GroupAbund.sniff(self,filename,False)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
314 isme = GroupAbund.sniff(self,filename,False)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
315 return isme
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
316
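The row layout described in the Shared and RelAbund docstrings (label, group, count, then `count` values) can be illustrated with a small standalone parser. This is only a sketch of the documented layout, not the logic of `GroupAbund.sniff`, which is not shown here; the function name `parse_abund_row` and its `convert` parameter are hypothetical. It handles data rows only, not the column-heading line that Mothur writes as of v1.20.

```python
def parse_abund_row(line, convert=int):
    """Parse one data row: label<TAB>group<TAB>count[<TAB>value(1..n)].

    `convert` turns each value into a number: int for sequence counts
    (shared files), float for relative abundances (relabund files).
    Raises ValueError if the row does not match the layout.
    """
    fields = line.rstrip('\n').split('\t')
    if len(fields) < 3:
        raise ValueError('expected at least label, group and count columns')
    label, group = fields[0], fields[1]
    count = int(fields[2])  # number of value columns that follow
    values = fields[3:]
    if len(values) != count:
        raise ValueError('count column does not match number of values')
    return label, group, [convert(v) for v in values]
```

For a shared file the default `convert=int` fits the per-OTU sequence counts; for a relabund file, pass `convert=float`.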
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
317 class SecondaryStructureMap(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
318 file_ext = 'map'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
319 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
320 """Initialize secondary structure map datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
321 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
322 self.column_names = ['Map']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
323
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
324 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
325 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
326 Determines whether the file is a secondary structure map format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
327 A single column with an integer value which indicates the row that this row maps to.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
328 Checks that the mapping is symmetric: if structMap[10] = 380 then structMap[380] = 10.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
329 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
330 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
331 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
332 line_num = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
333 rowidxmap = {}
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
334 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
335 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
336 line_num += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
337 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
338 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
339 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
340 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
341 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
342 pointer = int(line)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
343 if pointer > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
344 if pointer > line_num:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
345 rowidxmap[line_num] = pointer
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
346 elif pointer < line_num and rowidxmap[pointer] != line_num:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
347 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
348 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
349 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
350 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
351 if line_num > 1:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
352 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
353 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
354 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
355 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
356 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
357 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
358
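The symmetry rule from the SecondaryStructureMap docstring (if row 10 points to row 380, row 380 must point back to row 10) can be shown in isolation. This is a minimal sketch that works on a list of lines rather than a file, under the assumption that a value of 0 or less marks an unpaired row; the name `is_symmetric_map` is hypothetical and is not part of the datatype.

```python
def is_symmetric_map(lines):
    """Return True if every positive pointer is mirrored by its target row."""
    pointers = {}
    for row, text in enumerate(lines, start=1):
        text = text.strip()
        if not text:
            break  # treat the first blank line as end of data
        try:
            pointers[row] = int(text)
        except ValueError:
            return False  # a non-integer row cannot be a map entry
    if not pointers:
        return False  # an empty file is not a map
    for row, target in pointers.items():
        # Assumption: values <= 0 mean "unpaired" and are not checked.
        if target > 0 and pointers.get(target) != row:
            return False
    return True
```

Unlike the single-pass loop in `sniff`, this sketch collects all rows first and then checks each pair, which avoids depending on the order in which the two halves of a pair appear.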
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
359 class SequenceAlignment( Fasta ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
360 file_ext = 'align'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
361 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
362 """Initialize SequenceAlignment datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
363 Fasta.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
364
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
365 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
366 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
367 Determines whether the file is in Mothur align fasta format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
368 Each sequence line must be the same length
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
369 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
370
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
371 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
372 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
373 seq_len = -1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
374 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
375 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
376 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
377 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
378 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
379 if line: #first non-empty line
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
380 if line.startswith( '>' ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
381 #The next line.strip() must not be '', nor startswith '>'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
382 line = fh.readline().strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
383 if line == '' or line.startswith( '>' ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
384 break
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
385 if seq_len < 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
386 seq_len = len(line)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
387 elif seq_len != len(line):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
388 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
389 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
390 break #we found a non-empty line, but it's not a fasta header
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
391 if seq_len > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
392 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
393 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
394 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
395 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
396 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
397 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
398
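The rule stated in the SequenceAlignment docstring, that each sequence line must be the same length, can be sketched as a standalone check over a list of lines. It assumes one sequence line per header, as the `sniff` loop does, and it is deliberately stricter than `sniff`: a non-header line where a header is expected makes it return False instead of ending the scan. The name `is_aligned_fasta` is hypothetical.

```python
def is_aligned_fasta(lines):
    """Return True if every '>' header is followed by a sequence line
    and all sequence lines have the same length."""
    seq_len = None
    it = iter(lines)
    for raw in it:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if not line.startswith('>'):
            return False  # expected a FASTA header here
        # The line after a header must be a non-empty sequence line.
        seq = next(it, '').strip()
        if not seq or seq.startswith('>'):
            return False
        if seq_len is None:
            seq_len = len(seq)  # first sequence fixes the alignment width
        elif len(seq) != seq_len:
            return False
    return seq_len is not None
```

Storing the width in `seq_len` rather than a variable named `len` keeps the built-in `len()` callable inside the loop.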
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
399 class AlignCheck( Tabular ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
400 file_ext = 'align.check'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
401 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
402 """Initialize AlignCheck datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
403 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
404 self.column_names = ['name','pound','dash','plus','equal','loop','tilde','total']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
405 self.column_types = ['str','int','int','int','int','int','int','int']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
406 self.comment_lines = 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
407
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
408 def set_meta( self, dataset, overwrite = True, **kwd ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
409 # Tabular.set_meta( self, dataset, overwrite = overwrite, first_line_is_header = True, skip = 1 )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
410 data_lines = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
411 if dataset.has_data():
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
412 dataset_fh = open( dataset.file_name )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
413 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
414 line = dataset_fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
415 if not line: break
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
416 data_lines += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
417 dataset_fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
418 dataset.metadata.comment_lines = 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
419 dataset.metadata.data_lines = data_lines - 1 if data_lines > 0 else 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
420 dataset.metadata.column_names = self.column_names
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
421 dataset.metadata.column_types = self.column_types
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
422
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
423 class AlignReport(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
424 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
425 QueryName QueryLength TemplateName TemplateLength SearchMethod SearchScore AlignmentMethod QueryStart QueryEnd TemplateStart TemplateEnd PairwiseAlignmentLength GapsInQuery GapsInTemplate LongestInsert SimBtwnQuery&Template
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
426 AY457915 501 82283 1525 kmer 89.07 needleman 5 501 1 499 499 2 0 0 97.6
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
427 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
428 file_ext = 'align.report'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
429 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
430 """Initialize AlignReport datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
431 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
432 self.column_names = ['QueryName','QueryLength','TemplateName','TemplateLength','SearchMethod','SearchScore',
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
433 'AlignmentMethod','QueryStart','QueryEnd','TemplateStart','TemplateEnd',
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
434 'PairwiseAlignmentLength','GapsInQuery','GapsInTemplate','LongestInsert','SimBtwnQuery&Template'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
435 ]
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
436
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
437 class BellerophonChimera( Tabular ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
438 file_ext = 'bellerophon.chimera'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
439 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
440 """Initialize BellerophonChimera datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
441 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
442 self.column_names = ['Name','Score','Left','Right']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
443
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
444 class SecondaryStructureMatch(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
445 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
446 name pound dash plus equal loop tilde total
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
447 9_1_12 42 68 8 28 275 420 872
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
448 9_1_14 36 68 6 26 266 422 851
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
449 9_1_15 44 68 8 28 276 418 873
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
450 9_1_16 34 72 6 30 267 430 860
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
451 9_1_18 46 80 2 36 261
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
452 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
453 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
454 """Initialize SecondaryStructureMatch datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
455 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
456 self.column_names = ['name','pound','dash','plus','equal','loop','tilde','total']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
457
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
458 class DistanceMatrix( Text ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
459 file_ext = 'dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
460 """Add metadata elements"""
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
461 MetadataElement( name="sequence_count", default=0, desc="Number of sequences", readonly=True, visible=True, optional=True, no_value='?' )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
462
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
463 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
464 Text.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
465
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
466 def set_meta( self, dataset, overwrite = True, skip = 0, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
467 Text.set_meta(self, dataset,overwrite = overwrite, skip = skip, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
468 try:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
469 fh = open( dataset.file_name )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
470 line = fh.readline().strip()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
471 dataset.metadata.sequence_count = int(line)
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
472 except Exception as e:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
473 log.warn("DistanceMatrix set_meta %s" % e)
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
474 finally:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
475 fh.close()
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
476
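The set_meta method above reads the sequence count from the first line of the file, but its finally clause calls fh.close() even when open() itself failed and fh was never bound. A minimal sketch of the same read using a context manager follows; the helper name read_sequence_count and its None return value are illustrative assumptions, not part of the Galaxy API.

```python
def read_sequence_count(path):
    """Return the integer on the first line of a distance file, or None.

    Hypothetical helper: the name and the None-on-failure convention are
    assumptions made for this sketch.
    """
    try:
        # The with-block closes the file even if int() raises, and never
        # touches an unbound handle when open() fails.
        with open(path) as fh:
            return int(fh.readline().strip())
    except (IOError, OSError, ValueError):
        return None
```

A caller could assign the result to dataset.metadata.sequence_count only when it is not None, which leaves the declared no_value in place for unreadable files.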
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
477 class LowerTriangleDistanceMatrix(DistanceMatrix):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
478 file_ext = 'lower.dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
479 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
480 """Initialize lower-triangle distance matrix datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
481 DistanceMatrix.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
482
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
483 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
484 DistanceMatrix.init_meta( self, dataset, copy_from=copy_from )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
485
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
486 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
487 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
488 Determines whether the file is in lower-triangle distance matrix (PHYLIP) format.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
489 The first line has the number of sequences in the matrix.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
490 The remaining lines have the sequence name followed by a list of distances from all preceding sequences.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
491 5
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
492 U68589
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
493 U68590 0.3371
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
494 U68591 0.3609 0.3782
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
495 U68592 0.4155 0.3197 0.4148
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
496 U68593 0.2872 0.1690 0.3361 0.2842
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
497 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
498 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
499 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
500 count = 0
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
501 line = fh.readline()
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
502 sequence_count = int(line.strip())
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
503 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
504 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
505 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
506 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
507 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
508 if line:
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
509 # Split into fields
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
510 linePieces = line.split('\t')
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
511 # Each line should have the same number of
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
512 # fields as the Python line index
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
514 if len(linePieces) != (count + 1):
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
515 return False
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
516 # Distances should be floats
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
517 try:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
518 for linePiece in linePieces[1:]:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
519 check = float(linePiece)
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
520 except ValueError:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
521 return False
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
522 # Increment line counter
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
523 count += 1
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
524 # Only check first 5 lines
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
525 if count == 5:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
526 return True
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
527 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
528 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
529 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
530 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
531 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
532 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
533 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
534 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
535
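The lower-triangle check above can be sketched on an in-memory list of lines. This is a simplified illustration, not the Galaxy sniffer: the function name looks_like_lower_triangle and the max_rows parameter are assumptions, fields are assumed to be tab-separated as in the split('\t') call above, and blank lines are skipped rather than treated as end of file.

```python
def looks_like_lower_triangle(lines, max_rows=5):
    """Return True if lines resemble a lower-triangle distance matrix.

    Hypothetical helper for illustration. Row i (0-based, after the count
    header) must hold a sequence name followed by i float distances.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]
    if not rows:
        return False
    try:
        int(rows[0])  # header line: number of sequences
    except ValueError:
        return False
    body = rows[1:1 + max_rows]  # only the first few rows are inspected
    for i, row in enumerate(body):
        fields = row.split('\t')
        if len(fields) != i + 1:
            return False
        try:
            for value in fields[1:]:  # every field after the name
                float(value)
        except ValueError:
            return False
    return len(body) > 0
```

For example, the lines "5", "U68589" and "U68590\t0.3371" pass the check, while a first data row that already carries a distance does not.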
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
536 class SquareDistanceMatrix(DistanceMatrix):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
537 file_ext = 'square.dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
538
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
539 def __init__(self, **kwd):
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
540 DistanceMatrix.__init__( self, **kwd )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
541 def init_meta( self, dataset, copy_from=None ):
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
542 DistanceMatrix.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
543
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
544 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
545 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
546 Determines whether the file is in square distance matrix (PHYLIP) format.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
547 The first line has the number of sequences in the matrix.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
548 The following lines have the sequence name in the first column plus a column for the distance to each sequence
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
549 in the row order in which they appear in the matrix.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
550 3
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
551 U68589 0.0000 0.3371 0.3610
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
552 U68590 0.3371 0.0000 0.3783
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
553 U68591 0.3610 0.3783 0.0000
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
554 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
555 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
556 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
557 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
558 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
559 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
560 sequence_count = int(line)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
561 col_cnt = sequence_count + 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
562 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
563 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
564 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
565 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
566 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
567 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
568 if line[0] != '@':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
569 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
570 if len(linePieces) != col_cnt :
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
571 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
572 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
573 for i in range(1, col_cnt):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
574 check = float(linePieces[i])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
575 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
576 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
577 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
578 if count == 5:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
579 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
580 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
581 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
582 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
583 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
584 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
585 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
586 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
587 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
588
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
589 class PairwiseDistanceMatrix(DistanceMatrix,Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
590 file_ext = 'pair.dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
591 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
592 """Initialize pairwise distance matrix datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
593 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
594 self.column_names = ['Sequence','Sequence','Distance']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
595 self.column_types = ['str','str','float']
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
596 def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
597 Tabular.set_meta(self, dataset,overwrite = overwrite, skip = skip, **kwd )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
598
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
599 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
600 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
601 Determines whether the file is in pairwise (column-formatted) distance matrix format.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
602 The first and second columns have the sequence names and the third column is the distance between those sequences.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
603 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
604 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
605 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
606 count = 0
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
607 all_ints = True
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
608 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
609 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
610 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
611 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
612 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
613 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
614 if line[0] != '@':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
615 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
616 if len(linePieces) != 3:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
617 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
618 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
619 check = float(linePieces[2])
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
620 try:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
621 # See if it's also an integer
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
622 check_int = int(linePieces[2])
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
623 except ValueError:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
624 # At least one value is not an
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
625 # integer
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
626 all_ints = False
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
627 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
628 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
629 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
630 if count == 5:
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
631 if not all_ints:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
632 return True
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
633 else:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
634 return False
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
635 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
636 if count < 5 and count > 0:
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
637 if not all_ints:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
638 return True
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
639 else:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
640 return False
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
641 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
642 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
643 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
644 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
645 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
646
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
647 class AlignCheck(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
648 file_ext = 'align.check'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
649 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
650 """Initialize align.check datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
651 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
652 self.column_names = ['name','pound','dash','plus','equal','loop','tilde','total']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
653 self.columns = 8
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
654
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
655 class Names(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
656 file_ext = 'names'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
657 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
658 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
659 # http://www.mothur.org/wiki/Name_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
660 Name file shows the relationship between a representative sequence (col 1) and the comma-separated sequences it represents (col 2)
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
661 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
662 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
663 self.column_names = ['name','representatives']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
664 self.columns = 2
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
665
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
666 class Summary(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
667 file_ext = 'summary'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
668 def __init__(self, **kwd):
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
669 """Summarizes the quality of sequences in an unaligned or aligned fasta-formatted sequence file"""
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
670 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
671 self.column_names = ['seqname','start','end','nbases','ambigs','polymer']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
672 self.columns = 6
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
673
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
674 class Group(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
675 file_ext = 'groups'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
676 MetadataElement( name="groups", default=[], desc="Group Names", readonly=True, visible=True, no_value=[] )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
677 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
678 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
679 # http://www.mothur.org/wiki/Groups_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
680 Group file assigns sequence (col 1) to a group (col 2)
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
681 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
682 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
683 self.column_names = ['name','group']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
684 self.columns = 2
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
685 def set_meta( self, dataset, overwrite = True, skip = None, max_data_lines = None, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
686 Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
687 group_names = set()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
688 fh = open( dataset.file_name )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
689 try:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
690 for line in fh:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
691 fields = line.strip().split('\t')
32
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
692 try:
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
693 group_names.add(fields[1])
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
694 except IndexError:
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
695 # Ignore missing 2nd column
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
696 pass
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
697 dataset.metadata.groups = []
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
698 dataset.metadata.groups += group_names
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
699 finally:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
700 fh.close()
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
701
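The group-name collection performed in `Group.set_meta` can be sketched as a standalone function. This is a minimal illustration, not part of the module: `collect_group_names` is a hypothetical helper, it accepts any iterable of lines (such as an open file handle) instead of a Galaxy dataset, and it returns a sorted list, whereas the original stores the set contents in arbitrary order.

```python
def collect_group_names(lines):
    """Return the distinct values found in column 2 of tab-separated lines.

    Hypothetical helper for illustration; lines without a second
    column are ignored, as in Group.set_meta.
    """
    group_names = set()
    for line in lines:
        fields = line.strip().split('\t')
        if len(fields) > 1:
            group_names.add(fields[1])
    # Sorting is an added assumption to make the result deterministic.
    return sorted(group_names)
```

With a real file, the function could be called as `with open(path) as fh: collect_group_names(fh)`, which also guarantees the handle is closed if reading fails.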
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
702 class Design(Group):
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
703 file_ext = 'design'
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
704 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
705 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
706 # http://www.mothur.org/wiki/Design_File
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
707 Design file shows the relationship between a group (col 1) and a grouping (col 2), providing a way to merge groups.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
708 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
709 Group.__init__( self, **kwd )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
710
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
711 class AccNos(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
712 file_ext = 'accnos'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
713 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
714 """A list of names"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
715 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
716 self.column_names = ['name']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
717 self.columns = 1
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
718
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
719 class Oligos( Text ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
720 file_ext = 'oligos'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
721
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
722 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
723 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
724 # http://www.mothur.org/wiki/Oligos_File
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
725 Determines whether the file is in oligos format (primer and barcode definitions)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
726 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
727 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
728 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
729 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
730 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
731 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
732 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
733 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
734 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
735 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
736 if line[0] != '#':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
737 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
738 if len(linePieces) == 2 and re.match('forward|reverse',linePieces[0]):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
739 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
740 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
741 elif len(linePieces) == 3 and re.match('barcode',linePieces[0]):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
742 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
743 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
744 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
745 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
746 if count > 20:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
747 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
748 if count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
749 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
750 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
751 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
752 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
753 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
754 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
755
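The line checks in `Oligos.sniff` can be sketched as a function over an iterable of lines. This is a hedged illustration rather than the module's code: `looks_like_oligos` is a hypothetical name, and the sketch assumes the early exit after more than 20 matching lines sits inside the read loop, which the flattened listing does not show unambiguously.

```python
import re


def looks_like_oligos(lines):
    """Return True if every non-comment line is a primer or barcode entry.

    Hypothetical helper for illustration. As in the original sniffer,
    reading stops at the first empty line.
    """
    count = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            break  # treated as end of input
        if line.startswith('#'):
            continue  # comment lines are skipped
        pieces = line.split('\t')
        if len(pieces) == 2 and re.match('forward|reverse', pieces[0]):
            count += 1
        elif len(pieces) == 3 and re.match('barcode', pieces[0]):
            count += 1
        else:
            return False
        if count > 20:
            return True  # enough evidence; stop reading early
    return count > 0
```

Note that `re.match` anchors only at the start of the string, so a first column such as `forward_primer` would also be accepted, which matches the behaviour of the expressions used in the listing.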
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
756 class Frequency(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
757 file_ext = 'freq'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
758 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
759 """Position frequencies for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
760 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
761 self.column_names = ['position','frequency']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
762 self.column_types = ['int','float']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
763
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
764 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
765 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
766 Determines whether the file is a frequency tabular format for chimera analysis
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
767 #1.14.0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
768 0 0.000
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
769 1 0.000
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
770 ...
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
771 155 0.975
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
772 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
773 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
774 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
775 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
776 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
777 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
778 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
779 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
780 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
781 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
782 if line[0] != '#':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
783 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
784 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
785 i = int(linePieces[0])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
786 f = float(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
787 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
788 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
789 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
790 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
791 if count > 20:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
792 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
793 if count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
794 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
795 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
796 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
797 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
798 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
799 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
800
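The type check in `Frequency.sniff` (an integer position followed by a float frequency on every non-comment line) can be sketched as follows. `looks_like_frequency` is a hypothetical name, and the sketch catches only `ValueError` and `IndexError` where the listing uses a bare `except`; that narrowing is an assumption about which failures are expected.

```python
def looks_like_frequency(lines):
    """Return True if each non-comment line is '<int>\\t<float>'.

    Hypothetical helper for illustration; reading stops at the first
    empty line, as in the original sniffer.
    """
    count = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            break
        if line.startswith('#'):
            continue  # e.g. a version header such as '#1.14.0'
        pieces = line.split('\t')
        try:
            int(pieces[0])
            float(pieces[1])
        except (ValueError, IndexError):
            return False
        count += 1
        if count > 20:
            return True
    return count > 0
```

The same pattern extends to the quantile format by converting the six remaining columns with `float` inside the `try` block.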
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
801 class Quantile(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
802 file_ext = 'quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
803 MetadataElement( name="filtered", default=False, no_value=False, optional=True , desc="Quantiles calculated using a frequency filter", readonly=True)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
804 MetadataElement( name="masked", default=False, no_value=False, optional=True , desc="Quantiles calculated using a mask", readonly=True)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
805 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
806 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
807 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
808 self.column_names = ['num','ten','twentyfive','fifty','seventyfive','ninetyfive','ninetynine']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
809 self.column_types = ['int','float','float','float','float','float','float']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
810 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
811 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
812 Determines whether the file is a quantiles tabular format for chimera analysis
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
813 1 0 0 0 0 0 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
814 2 0.309198 0.309198 0.37161 0.37161 0.37161 0.37161
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
815 3 0.510982 0.563213 0.693529 0.858939 1.07442 1.20608
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
816 ...
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
817 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
818 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
819 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
820 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
821 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
822 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
823 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
824 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
825 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
826 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
827 if line[0] != '#':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
828 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
829 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
830 i = int(linePieces[0])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
831 f = float(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
832 f = float(linePieces[2])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
833 f = float(linePieces[3])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
834 f = float(linePieces[4])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
835 f = float(linePieces[5])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
836 f = float(linePieces[6])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
837 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
838 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
839 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
840 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
841 if count > 10:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
842 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
843 if count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
844 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
845 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
846 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
847 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
848 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
849 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
850
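The quantile sniffer above accepts a file when its leading non-comment rows hold an integer followed by six floats. A minimal sketch of the same check, assuming tab-separated rows as in the original; the name `looks_like_quantile` and the `max_rows` parameter are hypothetical and not part of the Galaxy or mothur code:

```python
def looks_like_quantile(lines, max_rows=10):
    """Return True if the leading data rows are an int followed by six floats.

    `lines` is any iterable of text lines (an open file works).
    """
    checked = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            break  # like the original, treat a blank line as end of data
        if line.startswith('#'):
            continue  # skip comment lines
        pieces = line.split('\t')
        if len(pieces) < 7:
            return False  # need one index column plus six quantile columns
        try:
            int(pieces[0])
            for value in pieces[1:7]:
                float(value)
        except ValueError:
            return False
        checked += 1
        if checked > max_rows:
            return True  # enough evidence, stop reading early
    return checked > 0
```

Because the function takes an iterable of lines, it can be called on an open file handle or on a list of strings, and a `with open(...)` block in the caller takes care of closing the file.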
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
851 class FilteredQuantile(Quantile):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
852 file_ext = 'filtered.quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
853 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
854 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
855 Quantile.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
856 self.filtered = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
857
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
858 class MaskedQuantile(Quantile):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
859 file_ext = 'masked.quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
860 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
861 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
862 Quantile.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
863 self.masked = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
864 self.filtered = False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
865
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
866 class FilteredMaskedQuantile(Quantile):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
867 file_ext = 'filtered.masked.quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
868 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
869 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
870 Quantile.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
871 self.masked = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
872 self.filtered = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
873
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
874 class LaneMask(Text):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
875 file_ext = 'filter'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
876
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
877 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
878 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
879 Determines whether the file is a lane mask filter: 1 line consisting of zeros and ones.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
880 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
881 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
882 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
883 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
884 buff = fh.read(1000)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
885 if not buff:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
886 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
887 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
888 if not re.match('^[01]+$',buff):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
889 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
890 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
891 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
892 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
893 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
894 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
895 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
896
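The lane mask docstring describes a file with one line of zeros and ones, read in 1000-character chunks. A minimal sketch of that chunked check, assuming a text-mode file-like object; `looks_like_lane_mask` and `chunk_size` are hypothetical names used only for illustration:

```python
import io
import re

MASK_CHUNK = re.compile(r'[01]+')


def looks_like_lane_mask(handle, chunk_size=1000):
    """Return True if the stream holds a single line of 0s and 1s."""
    seen_digits = False
    ended = False  # becomes True once a trailing line break has been seen
    while True:
        buff = handle.read(chunk_size)
        if not buff:
            break  # EOF
        body = buff.rstrip('\r\n')
        if ended and body:
            return False  # data after the line break means more than one line
        if body != buff:
            ended = True
        if body:
            if not MASK_CHUNK.fullmatch(body):
                return False  # something other than 0 or 1 in this chunk
            seen_digits = True
    return seen_digits


# Usage with an in-memory stream standing in for an open file:
print(looks_like_lane_mask(io.StringIO("0101100\n")))
```

Tracking whether a line break has already been seen keeps the result independent of where the chunk boundaries fall.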
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
897 class CountTable(Tabular):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
898 MetadataElement( name="groups", default=[], desc="Group Names", readonly=True, visible=True, no_value=[] )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
899 file_ext = 'count_table'
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
900
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
901 def __init__(self, **kwd):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
902 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
903 # http://www.mothur.org/wiki/Count_File
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
904 A table with first column names and following columns integer counts
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
905 # Example 1:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
906 Representative_Sequence total
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
907 U68630 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
908 U68595 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
909 U68600 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
910 # Example 2 (with group columns):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
911 Representative_Sequence total forest pasture
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
912 U68630 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
913 U68595 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
914 U68600 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
915 U68591 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
916 U68647 1 0 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
917 """
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
918 Tabular.__init__( self, **kwd )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
919 self.column_names = ['name','total']
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
920
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
921 def set_meta( self, dataset, overwrite = True, skip = 1, max_data_lines = None, **kwd ):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
922 try:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
923 data_lines = 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
924 fh = open( dataset.file_name )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
925 line = fh.readline()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
926 if line:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
927 line = line.strip()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
928 colnames = line.split()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
929 if len(colnames) > 1:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
930 dataset.metadata.columns = len( colnames )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
931 if len(colnames) > 2:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
932 dataset.metadata.groups = colnames[2:]
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
933 column_types = ['str']
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
934 for i in range(1,len(colnames)):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
935 column_types.append('int')
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
936 dataset.metadata.column_types = column_types
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
937 dataset.metadata.comment_lines = 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
938 while line:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
939 line = fh.readline()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
940 if not line: break
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
941 data_lines += 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
942 dataset.metadata.data_lines = data_lines
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
943 finally:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
944 fh.close()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
945
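`CountTable.set_meta` derives the column count, the group names and the column types from the header line alone. A minimal sketch of that header handling as a standalone function; `parse_count_header` is a hypothetical helper, not part of the Galaxy API, and it returns a plain dict instead of writing to `dataset.metadata`:

```python
def parse_count_header(header_line):
    """Split a count_table header into column count, groups and column types.

    Returns None when the header has fewer than two columns.
    """
    colnames = header_line.strip().split()
    if len(colnames) < 2:
        return None
    return {
        'columns': len(colnames),
        # everything after the name and total columns is a group name
        'groups': colnames[2:],
        # the first column is the sequence name, the rest are integer counts
        'column_types': ['str'] + ['int'] * (len(colnames) - 1),
    }


print(parse_count_header("Representative_Sequence\ttotal\tforest\tpasture\n"))
```

For the second docstring example this gives four columns with the groups `forest` and `pasture`; for the first example the group list is empty.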
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
946 class RefTaxonomy(Tabular):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
947 file_ext = 'ref.taxonomy'
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
948 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
949 # http://www.mothur.org/wiki/Taxonomy_outline
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
950 A table with 2 or 3 columns:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
951 - SequenceName
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
952 - Taxonomy (semicolon-separated taxonomy in descending order)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
953 - integer ?
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
954 Example: 2-column ( http://www.mothur.org/wiki/Taxonomy_outline )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
955 X56533.1 Eukaryota;Alveolata;Ciliophora;Intramacronucleata;Oligohymenophorea;Hymenostomatida;Tetrahymenina;Glaucomidae;Glaucoma;
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
956 X97975.1 Eukaryota;Parabasalidea;Trichomonada;Trichomonadida;unclassified_Trichomonadida;
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
957 AF052717.1 Eukaryota;Parabasalidea;
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
958 Example: 3-column ( http://vamps.mbl.edu/resources/databases.php )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
959 v3_AA008 Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus 5
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
960 v3_AA016 Bacteria 120
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
961 v3_AA019 Archaea;Crenarchaeota;Marine_Group_I 1
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
962 """
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
963 def __init__(self, **kwd):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
964 Tabular.__init__( self, **kwd )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
965 self.column_names = ['name','taxonomy']
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
966
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
967 def sniff( self, filename ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
968 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
969 Determines whether the file is a Reference Taxonomy
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
970 """
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
971 try:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
972 pat = '^([^ \t\n\r\x0c\x0b;]+([(]\\d+[)])?(;[^ \t\n\r\x0c\x0b;]+([(]\\d+[)])?)*(;)?)$'
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
973 fh = open( filename )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
974 count = 0
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
975 # VAMPS taxonomy files do not require a semicolon after the last taxonomy category
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
976 # but assume the file will have some multi-level taxonomy assignments
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
977 found_semicolons = False
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
978 while True:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
979 line = fh.readline()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
980 if not line:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
981 break #EOF
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
982 line = line.strip()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
983 if line:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
984 fields = line.split('\t')
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
985 if not (2 <= len(fields) <= 3):
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
986 return False
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
987 if not re.match(pat,fields[1]):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
988 return False
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
989 if not found_semicolons and str(fields[1]).count(';') > 0:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
990 found_semicolons = True
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
991 if len(fields) == 3:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
992 check = int(fields[2])
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
993 count += 1
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
994 if count > 100:
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
995 break
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
996 if count > 0:
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
997 # This will be true if at least one entry
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
998 # has semicolons in the 2nd column
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
999 return found_semicolons
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1000 except:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1001 pass
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1002 finally:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1003 fh.close()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1004 return False
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1005
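The reference taxonomy sniffer combines three checks: two or three tab-separated fields, a taxonomy field matching the pattern, and at least one row with a semicolon. A minimal sketch of that logic over an iterable of lines, reusing the pattern from the source as a raw string; `looks_like_ref_taxonomy` and `max_rows` are hypothetical names:

```python
import re

# Same pattern as in the source, written as a raw string.
TAXONOMY = re.compile(
    r'^([^ \t\n\r\f\v;]+([(]\d+[)])?(;[^ \t\n\r\f\v;]+([(]\d+[)])?)*(;)?)$'
)


def looks_like_ref_taxonomy(lines, max_rows=100):
    """Return True if the rows look like a 2- or 3-column taxonomy table."""
    count = 0
    found_semicolons = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split('\t')
        if not 2 <= len(fields) <= 3:
            return False
        if not TAXONOMY.match(fields[1]):
            return False
        if ';' in fields[1]:
            found_semicolons = True  # at least one multi-level assignment
        if len(fields) == 3:
            try:
                int(fields[2])  # optional third column must be an integer
            except ValueError:
                return False
        count += 1
        if count > max_rows:
            break
    return count > 0 and found_semicolons
```

Requiring at least one semicolon keeps a generic two-column table of single words from being mistaken for a taxonomy file, which is the intent stated in the comments at lines 975 to 976.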
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1006 class SequenceTaxonomy(RefTaxonomy):
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1007 file_ext = 'seq.taxonomy'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1008 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1009 # http://www.mothur.org/wiki/Taxonomy_outline
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1010 A table with 2 columns:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1011 - SequenceName
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1012 - Taxonomy (semicolon-separated taxonomy in descending order)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1013 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1014 X56533.1 Eukaryota;Alveolata;Ciliophora;Intramacronucleata;Oligohymenophorea;Hymenostomatida;Tetrahymenina;Glaucomidae;Glaucoma;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1015 X97975.1 Eukaryota;Parabasalidea;Trichomonada;Trichomonadida;unclassified_Trichomonadida;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1016 AF052717.1 Eukaryota;Parabasalidea;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1017 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1018 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1019 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1020 self.column_names = ['name','taxonomy']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1021
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1022 def sniff( self, filename ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1023 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1024 Determines whether the file is a SequenceTaxonomy
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1025 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1026 try:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1027 pat = r'^([^ \t\n\r\f\v;]+([(]\d+[)])?[;])+$'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1028 fh = open( filename )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1029 count = 0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1030 while True:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1031 line = fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1032 if not line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1033 break #EOF
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1034 line = line.strip()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1035 if line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1036 fields = line.split('\t')
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1037 if len(fields) != 2:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1038 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1039 if not re.match(pat,fields[1]):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1040 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1041 count += 1
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1042 if count > 10:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1043 break
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1044 if count > 0:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1045 return True
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1046 except:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1047 pass
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1048 finally:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1049 fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1050 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1051
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1052 class RDPSequenceTaxonomy(SequenceTaxonomy):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1053 file_ext = 'rdp.taxonomy'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1054 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1055 A table with 2 columns:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1056 - SequenceName
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1057 - Taxonomy (semicolon-separated taxonomy in descending order, RDP requires exactly 6 levels deep)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1058 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1059 AB001518.1 Bacteria;Bacteroidetes;Sphingobacteria;Sphingobacteriales;unclassified_Sphingobacteriales;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1060 AB001724.1 Bacteria;Cyanobacteria;Cyanobacteria;Family_II;GpIIa;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1061 AB001774.1 Bacteria;Chlamydiae;Chlamydiae;Chlamydiales;Chlamydiaceae;Chlamydophila;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1062 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1063 def sniff( self, filename ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1064 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1065 Determines whether the file is an RDPSequenceTaxonomy (taxonomy exactly 6 levels deep)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1066 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1067 try:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1068 pat = r'^([^ \t\n\r\f\v;]+([(]\d+[)])?[;]){6}$'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1069 fh = open( filename )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1070 count = 0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1071 while True:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1072 line = fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1073 if not line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1074 break #EOF
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1075 line = line.strip()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1076 if line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1077 fields = line.split('\t')
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1078 if len(fields) != 2:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1079 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1080 if not re.match(pat,fields[1]):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1081 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1082 count += 1
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1083 if count > 10:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1084 break
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1085 if count > 0:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1086 return True
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1087 except:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1088 pass
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1089 finally:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1090 fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1091 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1092
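The two sniffers above differ only in the quantifier that ends the taxonomy pattern: `+` accepts any depth, while `{6}` requires exactly six levels. Below is a minimal sketch of that difference, assuming the patterns are written as raw strings. The helper `taxonomy_depth` and the constant names are illustrative and are not part of metagenomics.py.

```python
import re

# Same patterns as the sniffers, written as raw strings (assumption).
ANY_DEPTH = re.compile(r'^([^ \t\n\r\f\v;]+([(]\d+[)])?[;])+$')
RDP_DEPTH = re.compile(r'^([^ \t\n\r\f\v;]+([(]\d+[)])?[;]){6}$')


def taxonomy_depth(taxonomy):
    """Return the number of levels, or None if the string is malformed.

    Hypothetical helper: every level must end with a semicolon, so the
    depth of a well-formed string equals its semicolon count.
    """
    if not ANY_DEPTH.match(taxonomy):
        return None
    return taxonomy.count(';')


examples = [
    'Eukaryota;Parabasalidea;',
    'Bacteria;Chlamydiae;Chlamydiae;Chlamydiales;Chlamydiaceae;Chlamydophila;',
    'Bacteria(100);Firmicutes(98);',   # confidence scores in parentheses
    'Bacteria;Firmicutes',             # missing trailing semicolon
]
for tax in examples:
    print(tax, taxonomy_depth(tax), bool(RDP_DEPTH.match(tax)))
```

Only the six-level Chlamydophila line satisfies both patterns. The string without a trailing semicolon satisfies neither.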
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1093 class ConsensusTaxonomy(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1094 file_ext = 'cons.taxonomy'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1095 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1096 """A table of OTU, count and consensus taxonomy"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1097 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1098 self.column_names = ['OTU','count','taxonomy']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1099
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1100 class TaxonomySummary(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1101 file_ext = 'tax.summary'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1102 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1103 """A Summary of taxon classification"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1104 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1105 self.column_names = ['taxlevel','rankID','taxon','daughterlevels','total']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1106
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1107 class Phylip(Text):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1108 file_ext = 'phy'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1109
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1110 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1111 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1112 Determines whether the file is in Phylip format (Interleaved or Sequential)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1113 The first line of the input file contains the number of species and the
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1114 number of characters, in free format, separated by blanks (not by
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1115 commas). The information for each species follows, starting with a
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1116 ten-character species name (which can include punctuation marks and blanks),
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1117 and continuing with the characters for that species.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1118 http://evolution.genetics.washington.edu/phylip/doc/main.html#inputfiles
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1119 Interleaved Example:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1120 6 39
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1121 Archaeopt CGATGCTTAC CGCCGATGCT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1122 HesperorniCGTTACTCGT TGTCGTTACT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1123 BaluchitheTAATGTTAAT TGTTAATGTT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1124 B. virginiTAATGTTCGT TGTTAATGTT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1125 BrontosaurCAAAACCCAT CATCAAAACC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1126 B.subtilisGGCAGCCAAT CACGGCAGCC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1127
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1128 TACCGCCGAT GCTTACCGC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1129 CGTTGTCGTT ACTCGTTGT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1130 AATTGTTAAT GTTAATTGT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1131 CGTTGTTAAT GTTCGTTGT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1132 CATCATCAAA ACCCATCAT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1133 AATCACGGCA GCCAATCAC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1134 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1135 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1136 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1137 # counts line
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1138 line = fh.readline().strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1139 linePieces = line.split()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1140 count = int(linePieces[0])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1141 seq_len = int(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1142 # data lines
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1143 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1144 TODO check data lines
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1145 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1146 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1147 # name is the first 10 characters
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1148 name = line[0:10]
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1149 seq = line[10:].strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1150 # nucleic base or amino acid 1-char designators (spaces allowed)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1151 bases = ''.join(seq.split())
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1152 # float per base (each separated by space)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1153 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1154 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1155 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1156 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1157 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1158 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1159 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1160
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1161
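The Phylip sniffer above validates only the counts line and leaves the data lines as a TODO. One possible way to check them is sketched below. This is not the author's implementation, and the function name `looks_like_phylip` is hypothetical. It assumes a sequence alignment (not a distance matrix) in which each of the first `count` lines starts with a ten-character name. That holds for interleaved files, and for sequential files whose sequences each fit on one line.

```python
def looks_like_phylip(lines):
    """Check the counts line and the first block of Phylip data lines."""
    rows = [ln.rstrip('\n') for ln in lines]
    if not rows:
        return False
    header = rows[0].split()
    if len(header) < 2:
        return False
    try:
        count, seq_len = int(header[0]), int(header[1])
    except ValueError:
        return False
    if count < 1 or seq_len < 1:
        return False
    first_block = rows[1:count + 1]
    if len(first_block) != count:
        return False  # fewer data lines than declared species
    for row in first_block:
        # The name is the first 10 characters; the rest is sequence data,
        # which may contain blanks between groups of characters.
        name, seq = row[:10], ''.join(row[10:].split())
        if not name.strip() or not seq or len(seq) > seq_len:
            return False
        # Letters plus a few common gap/unknown symbols (an assumption).
        if not all(ch.isalpha() or ch in '-?.*' for ch in seq):
            return False
    return True


sample = [
    '2 20\n',
    'Archaeopt CGATGCTTAC CGCCGATGCT\n',
    'HesperorniCGTTACTCGT TGTCGTTACT\n',
]
print(looks_like_phylip(sample))
```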
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1162 class Axes(Tabular):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1163 file_ext = 'axes'
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1164
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1165 def __init__(self, **kwd):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1166 """Initialize axes datatype"""
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1167 Tabular.__init__( self, **kwd )
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1168 def sniff( self, filename ):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1169 """
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1170 Determines whether the file is an axes format
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1171 The first line may have column headings.
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1172 The following lines have the name in the first column plus float columns for each axis.
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1173 ==> 98_sq_phylip_amazon.fn.unique.pca.axes <==
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1174 group axis1 axis2
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1175 forest 0.000000 0.145743
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1176 pasture 0.145743 0.000000
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1177
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1178 ==> 98_sq_phylip_amazon.nmds.axes <==
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1179 axis1 axis2
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1180 U68589 0.262608 -0.077498
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1181 U68590 0.027118 0.195197
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1182 U68591 0.329854 0.014395
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1183 """
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1184 try:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1185 fh = open( filename )
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1186 count = 0
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1187 line = fh.readline()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1188 line = line.strip()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1189 col_cnt = None
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1190 all_integers = True
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1191 while True:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1192 line = fh.readline()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1193 line = line.strip()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1194 if not line:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1195 break #EOF
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1196 if line:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1197 fields = line.split('\t')
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1198 if col_cnt is None: # ignore values in first line as they may be column headings
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1199 col_cnt = len(fields)
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
1200 # There should be at least 2 columns
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
1201 if col_cnt < 2:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
1202 return False
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1203 else:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1204 if len(fields) != col_cnt :
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1205 return False
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1206 try:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1207 for i in range(1, col_cnt):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1208 check = float(fields[i])
32
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
1209 # Check abs value is <= 1.0
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
1210 if abs(check) > 1.0:
ec8df51e841a Fixes courtesy of Peter Briggs:
Jim Johnson <jj@umn.edu>
parents: 30
diff changeset
1211 return False
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1212 # Also test for whether value is an integer
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1213 try:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1214 check = int(fields[i])
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1215 except ValueError:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1216 all_integers = False
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1217 except ValueError:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1218 return False
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1219 count += 1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1220 if count > 10:
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1221 break
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1222 if count > 0:
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1223 if not all_integers:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1224 # At least one value was a float
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1225 return True
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1226 else:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1227 return False
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1228 except:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1229 pass
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1230 finally:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1231 fh.close()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1232 return False
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1233
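The value checks at the end of the sniffer above can be read as: reject a non-numeric field, reject a magnitude above 1.0, and accept only if at least one value is not a plain integer. The following is a minimal standalone sketch of that logic, not the module's own code. The function name `values_look_fractional` and its `rows` argument (a list of lists of strings) are hypothetical, and which columns the original inspects is not visible here, so every value passed in is checked.

```python
def values_look_fractional(rows, max_rows=10):
    """Return True if all values are numbers in [-1, 1] and at least one is non-integer."""
    all_integers = True
    count = 0
    for values in rows:
        for text in values:
            try:
                number = float(text)
            except ValueError:
                # Not numeric at all: cannot be this kind of file.
                return False
            if abs(number) > 1.0:
                return False
            try:
                int(text)
            except ValueError:
                # Parses as a float but not as an int, e.g. "0.25".
                all_integers = False
        count += 1
        if count > max_rows:
            # Mirror the original: stop once more than max_rows rows are seen.
            break
    return count > 0 and not all_integers


print(values_look_fractional([["0.5", "1"], ["0", "0.25"]]))
print(values_look_fractional([["0", "1"]]))
```

Requiring one non-integer value keeps files that hold only 0 and 1 from being claimed by this sniffer.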
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1234 class SffFlow(Tabular):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1235 MetadataElement( name="flow_values", default="", no_value="", optional=True , desc="Total number of flow values", readonly=True)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1236 MetadataElement( name="flow_order", default="TACG", no_value="TACG", desc="Order of the nucleotide flows", readonly=False)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1237 file_ext = 'sff.flow'
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1238 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1239 # http://www.mothur.org/wiki/Flow_file
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1240 The first line is the total number of flow values - 800 for Titanium data. For GS FLX it would be 400.
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1241 Following lines contain:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1242 - SequenceName
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1243 - the number of useable flows as defined by 454's software
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1244 - the flow intensity for each base going in the order of TACG.
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1245 Example:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1246 800
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1247 GQY1XT001CQL4K 85 1.04 0.00 1.00 0.02 0.03 1.02 0.05 ...
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1248 GQY1XT001CQIRF 84 1.02 0.06 0.98 0.06 0.09 1.05 0.07 ...
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1249 GQY1XT001CF5YW 88 1.02 0.02 1.01 0.04 0.06 1.02 0.03 ...
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1250 """
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1251 def __init__(self, **kwd):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1252 Tabular.__init__( self, **kwd )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1253
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1254 def set_meta( self, dataset, overwrite = True, skip = 1, max_data_lines = None, **kwd ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1255 Tabular.set_meta(self, dataset, overwrite, 1, max_data_lines)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1256 try:
16
541e3c97c240 Mothur - fix set_meta for SffFlow in datatypes/metagenomics.py
Jim Johnson <jj@umn.edu>
parents: 15
diff changeset
1257 fh = open( dataset.file_name )
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1258 line = fh.readline()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1259 line = line.strip()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1260 flow_values = int(line)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1261 dataset.metadata.flow_values = flow_values
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1262 finally:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1263 fh.close()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1264
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1265 def make_html_table( self, dataset, skipchars=[] ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1266 """Create HTML table, used for displaying peek"""
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1267 out = ['<table cellspacing="0" cellpadding="3">']
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1268 comments = []
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1269 try:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1270 # Generate column header
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1271 out.append('<tr>')
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1272 out.append( '<th>%d. Name</th>' % 1 )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1273 out.append( '<th>%d. Flows</th>' % 2 )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1274 for i in range( 3, dataset.metadata.columns+1 ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1275 base = dataset.metadata.flow_order[(i+1)%4]
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1276 out.append( '<th>%d. %d %s</th>' % (i, i-2, base) )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1277 out.append('</tr>')
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1278 out.append( self.make_html_peek_rows( dataset, skipchars=skipchars ) )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1279 out.append( '</table>' )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1280 out = "".join( out )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1281 except Exception, exc:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1282 out = "Can't create peek %s" % str( exc )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1283 return out
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1284
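The flow file layout described in the SffFlow docstring (a header line with the total number of flow values, then one record per read) can be parsed as sketched below. This is an illustrative reader under stated assumptions, not part of the Galaxy or mothur API. The name `read_flow_records` is hypothetical, fields are assumed to be whitespace-separated, and the flow order is assumed to cycle through `TACG` starting at the first intensity.

```python
import io


def read_flow_records(handle, flow_order="TACG"):
    """Yield (name, usable_flows, [(base, intensity), ...]) for each record."""
    # First line: total number of flow values (800 for Titanium, 400 for GS FLX).
    total_flows = int(handle.readline().strip())
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name = fields[0]
        usable_flows = int(fields[1])
        intensities = [float(value) for value in fields[2:]]
        if len(intensities) > total_flows:
            raise ValueError("more intensities than the header allows: %s" % name)
        # Label each intensity with the base flowed at that position.
        labelled = [(flow_order[i % len(flow_order)], value)
                    for i, value in enumerate(intensities)]
        yield name, usable_flows, labelled


# Shortened record: real files carry one intensity per flow.
sample = io.StringIO(u"800\nGQY1XT001CQL4K 85 1.04 0.00 1.00 0.02\n")
for name, usable_flows, labelled in read_flow_records(sample):
    print(name, usable_flows, labelled[:4])
```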
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1285 class Newick( Text ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1286 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1287 The Newick Standard for representing trees in computer-readable form makes use of the correspondence between trees and nested parentheses.
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1288 http://evolution.genetics.washington.edu/phylip/newicktree.html
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1289 http://en.wikipedia.org/wiki/Newick_format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1290 Example:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1291 (B,(A,C,E),D);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1292 or example with branch lengths:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1293 (B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1294 or an example with embedded comments but no branch lengths:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1295 ((a [&&PRIME S=x], b [&&PRIME S=y]), c [&&PRIME S=z]);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1296 Example with named interior node:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1297 (B:6.0,(A:5.0,C:3.0,E:4.0)Ancestor1:5.0,D:11.0);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1298 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1299 file_ext = 'tre'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1300
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1301 def __init__(self, **kwd):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1302 Text.__init__( self, **kwd )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1303
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1304 def sniff( self, filename ): ## TODO
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1305 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1306 Determine whether the file is in Newick format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1307 Note: Last non-space char of a tree should be a semicolon: ';'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1308 Usually the first char will be an open parenthesis: '('
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1309 (,,(,)); no nodes are named
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1310 (A,B,(C,D)); leaf nodes are named
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1311 (A,B,(C,D)E)F; all nodes are named
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1312 (:0.1,:0.2,(:0.3,:0.4):0.5); all but root node have a distance to parent
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1313 (:0.1,:0.2,(:0.3,:0.4):0.5):0.0; all have a distance to parent
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1314 (A:0.1,B:0.2,(C:0.3,D:0.4):0.5); distances and leaf names (popular)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1315 (A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F; distances and all names
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1316 ((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A; a tree rooted on a leaf node (rare)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1317 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1318 if not os.path.exists(filename):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1319 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1320 try:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1321 ## For now, guess this is a Newick file if it starts with a '(' and ends with a ';'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1322 flen = os.path.getsize(filename)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1323 fh = open( filename )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1324 len = min(flen,2000)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1325 # check end of the file for a semicolon
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1326 fh.seek(-len,os.SEEK_END)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1327 buf = fh.read(len).strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1328 buf = buf.strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1329 if not buf.endswith(';'):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1330 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1331 # See if this starts with an open parenthesis
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1332 if len < flen:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1333 fh.seek(0)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1334 buf = fh.read(len).strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1335 if buf.startswith('('):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1336 return True
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1337 except:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1338 pass
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1339 finally:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1340 fh.close()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1341 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1342
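The Newick heuristic above (last non-space character is `;`, first is usually `(`) can be sketched as a standalone function. This is an assumption-laden illustration rather than the module's code: the name `looks_like_newick` and the `window` parameter are hypothetical. The file is opened in binary mode because Python 3 text-mode files do not allow a nonzero seek relative to the end of the file.

```python
import os
import tempfile


def looks_like_newick(path, window=2000):
    """Heuristic: first non-space char is '(' and last non-space char is ';'."""
    try:
        size = os.path.getsize(path)
        if size == 0:
            return False
        chunk = min(size, window)
        with open(path, "rb") as fh:
            head = fh.read(chunk).lstrip()
            # Jump to the final `chunk` bytes to inspect the end of the file.
            fh.seek(-chunk, os.SEEK_END)
            tail = fh.read(chunk).rstrip()
    except (IOError, OSError):
        return False
    return head.startswith(b"(") and tail.endswith(b";")


# Usage: write a small tree to a temporary file and test it.
fd, tree_path = tempfile.mkstemp(suffix=".tre")
with os.fdopen(fd, "w") as out:
    out.write("(B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);\n")
print(looks_like_newick(tree_path))
os.remove(tree_path)
```

The `with` block closes the file on every path, so no `finally` clause is needed and a failed `open` cannot leave an undefined handle behind.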
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1343 class Nhx( Newick ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1344 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1345 New Hampshire eXtended (NHX): Newick with embedded [&&NHX:...] comments
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1346 The Newick Standard for representing trees in computer-readable form makes use of the correspondence between trees and nested parentheses.
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1347 http://evolution.genetics.washington.edu/phylip/newicktree.html
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1348 http://en.wikipedia.org/wiki/Newick_format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1349 Example:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1350 (gene1_Hu[&&NHX:S=Hu_Homo_sapiens], (gene2_Hu[&&NHX:S=Hu_Homo_sapiens], gene2_Mu[&&NHX:S=Mu_Mus_musculus]));
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1351 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1352 file_ext = 'nhx'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1353
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1354 class Nexus( Text ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1355 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1356 http://en.wikipedia.org/wiki/Nexus_file
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1357 Example:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1358 #NEXUS
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1359 BEGIN TAXA;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1360 Dimensions NTax=4;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1361 TaxLabels fish frog snake mouse;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1362 END;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1363
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1364 BEGIN CHARACTERS;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1365 Dimensions NChar=20;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1366 Format DataType=DNA;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1367 Matrix
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1368 fish ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1369 frog ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1370 snake ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1371 mouse ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1372 END;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1373
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1374 BEGIN TREES;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1375 Tree best=(fish, (frog, (snake, mouse)));
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1376 END;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1377 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1378 file_ext = 'nex'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1379
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1380 def __init__(self, **kwd):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1381 Text.__init__( self, **kwd )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1382
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1383 def sniff( self, filename ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1384 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1385 Determines whether the file is in nexus format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1386 First line should be:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1387 #NEXUS
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1388 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1389 try:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1390 fh = open( filename )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1391 count = 0
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1392 line = fh.readline()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1393 line = line.strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1394 if line and line == '#NEXUS':
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1395 fh.close()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1396 return True
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1397 except:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1398 pass
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1399 finally:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1400 fh.close()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1401 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1402
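The Nexus check described above reduces to a single test: after stripping whitespace, the first line must be exactly `#NEXUS`. A minimal sketch follows, with the hypothetical name `looks_like_nexus`. It assumes, as the original does, an exact and case-sensitive match. It uses a `with` block so the handle is closed once and nothing is referenced if `open` itself fails.

```python
import os
import tempfile


def looks_like_nexus(path):
    """Return True if the first line of the file is exactly '#NEXUS'."""
    try:
        with open(path) as fh:
            return fh.readline().strip() == "#NEXUS"
    except (IOError, OSError, UnicodeDecodeError):
        # Unreadable or non-text files are simply not Nexus.
        return False


# Usage: write a small Nexus header to a temporary file and test it.
fd, nexus_path = tempfile.mkstemp(suffix=".nex")
with os.fdopen(fd, "w") as out:
    out.write("#NEXUS\nBEGIN TAXA;\nDimensions NTax=4;\nEND;\n")
print(looks_like_nexus(nexus_path))
os.remove(nexus_path)
```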
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1403
26
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1404 ## Biom
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1405
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1406 class BiologicalObservationMatrix( Text ):
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1407 file_ext = 'biom'
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1408 """
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1409 http://biom-format.org/documentation/biom_format.html
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1410 The format of the file is JSON:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1411 {
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1412 "id":null,
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1413 "format": "Biological Observation Matrix 0.9.1-dev",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1414 "format_url": "http://biom-format.org",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1415 "type": "OTU table",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1416 "generated_by": "QIIME revision 1.4.0-dev",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1417 "date": "2011-12-19T19:00:00",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1418 "rows":[
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1419 {"id":"GG_OTU_1", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1420 {"id":"GG_OTU_2", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1421 {"id":"GG_OTU_3", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1422 ],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1423 "columns": [
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1424 {"id":"Sample1", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1425 {"id":"Sample2", "metadata":null}
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1426 ],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1427 "matrix_type": "sparse",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1428 "matrix_element_type": "int",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1429 "shape": [3, 2],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1430 "data":[[0,1,1],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1431 [1,0,5],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1432 [2,1,4]
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1433 ]
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1434 }
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1435
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1436 """
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1437
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1438 def __init__(self, **kwd):
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1439 Text.__init__( self, **kwd )
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1440
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1441 def sniff( self, filename ):
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1442 if os.path.getsize(filename) < 50000:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1443 try:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1444 data = simplejson.load(open(filename))
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1445 if 'Biological Observation Matrix' in data['format']:  # str.find() returns 0 (falsy) on a match at the start and -1 (truthy) on a miss
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1446 return True
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1447 except:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1448 pass
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1449 return False
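The sniffing logic above can be sketched on its own, outside Galaxy. This is a minimal sketch, not the tool's implementation: it assumes Python 3 and the standard `json` module in place of `simplejson`, and `looks_like_biom` and `max_size` are hypothetical names chosen for illustration.

```python
import json
import os


def looks_like_biom(filename, max_size=50000):
    """Return True if a small JSON file declares a BIOM 'format' string."""
    # Mirror the size guard used by the sniffer: only parse small files.
    if os.path.getsize(filename) >= max_size:
        return False
    try:
        with open(filename) as handle:
            data = json.load(handle)
        # Use a membership test: str.find() would return 0 for a match at
        # position 0, which is falsy, and -1 for a miss, which is truthy.
        return 'Biological Observation Matrix' in data['format']
    except (ValueError, KeyError, TypeError):
        # Not JSON, no 'format' key, or 'format' is not a string.
        return False
```

The membership test is the important design choice here; the rest only narrows the bare `except` to the errors a malformed file is expected to raise.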
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1450
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1451
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1452
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1453
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1454 ## Qiime Classes
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1455
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1456 class QiimeMetadataMapping(Tabular):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1457 MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1458 file_ext = 'qiimemapping'
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1459
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1460 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1461 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1462 http://qiime.sourceforge.net/documentation/file_formats.html#mapping-file-overview
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1463 Information about the samples necessary to perform the data analysis.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1464 # self.column_names = ['#SampleID','BarcodeSequence','LinkerPrimerSequence','Description']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1465 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1466 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1467
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1468 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1469 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1470 Determines whether the file is a qiime mapping file
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1471 Just checking for an appropriate header line for now, could be improved
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1472 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1473 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1474 pat = '#SampleID(\t[a-zA-Z][a-zA-Z0-9_]*)*\tDescription'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1475 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1476 for line in fh:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1477 # 'line' is supplied by the loop; the original read from an undefined 'dataset_fh' and never stopped at end of file
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1478 if re.match(pat,line):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1479 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1480 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1481 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1482 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1483 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1484 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1485
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1486 def set_column_names(self, dataset):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1487 if dataset.has_data():
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1488 dataset_fh = open( dataset.file_name )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1489 line = dataset_fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1490 if line.startswith('#SampleID'):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1491 dataset.metadata.column_names = line.strip().split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1492 dataset_fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1493
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1494 def set_meta( self, dataset, overwrite = True, skip = None, max_data_lines = None, **kwd ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1495 Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1496 self.set_column_names(dataset)
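The header check and column-name extraction in QiimeMetadataMapping can be combined in a standalone sketch. It reuses the regular expression from `sniff` and assumes Python 3; `MAPPING_HEADER` and `find_mapping_header` are hypothetical names, not part of Galaxy or QIIME.

```python
import re

# Same pattern as the sniffer: '#SampleID', any number of tab-separated
# identifier columns, then a tab and 'Description'.
MAPPING_HEADER = re.compile(r'#SampleID(\t[a-zA-Z][a-zA-Z0-9_]*)*\tDescription')


def find_mapping_header(lines):
    """Return the column names of the first matching header line, or None."""
    for line in lines:  # iterating ends cleanly at end of input
        if MAPPING_HEADER.match(line):
            return line.strip().split('\t')
    return None
```

Because the function takes any iterable of lines, it can be passed an open file handle directly, for example inside a `with open(path) as fh:` block.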
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1497
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1498 class QiimeOTU(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1499 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1500 Associates OTUs with sequence IDs
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1501 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1502 0 FLP3FBN01C2MYD FLP3FBN01B2ALM
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1503 1 FLP3FBN01DF6NE FLP3FBN01CKW1J FLP3FBN01CHVM4
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1504 2 FLP3FBN01AXQ2Z
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1505 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1506 file_ext = 'qiimeotu'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1507
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1508 class QiimeOTUTable(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1509 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1510 #Full OTU Counts
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1511 #OTU ID PC.354 PC.355 PC.356 Consensus Lineage
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1512 0 0 1 0 Root;Bacteria;Firmicutes;"Clostridia";Clostridiales
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1513 1 1 3 1 Root;Bacteria
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1514 2 0 2 2 Root;Bacteria;Bacteroidetes
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1515 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1516 MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1517 file_ext = 'qiimeotutable'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1518 def init_meta( self, dataset, copy_from=None ):
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
1519 Tabular.init_meta( self, dataset, copy_from=copy_from )
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1520 def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1521 self.set_column_names(dataset)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1522 def set_column_names(self, dataset):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1523 if dataset.has_data():
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1524 dataset_fh = open( dataset.file_name )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1525 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1526 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1527 if line.startswith('#OTU ID'):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1528 dataset.metadata.column_names = line.strip().split('\t')
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1529 dataset_fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1530 dataset.metadata.comment_lines = 2
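As a rough illustration of what `set_column_names` does for an OTU table, the sketch below reads the second line of the file and splits it on tabs. It assumes Python 3 and a tab-delimited layout like the docstring example; `otu_table_columns` is a hypothetical helper name.

```python
def otu_table_columns(lines):
    """Return the column names from the '#OTU ID' header on the second line."""
    lines = iter(lines)
    next(lines, '')           # skip the first comment line, e.g. '#Full OTU Counts'
    header = next(lines, '')  # empty string if the input has fewer than two lines
    if header.startswith('#OTU ID'):
        return header.strip().split('\t')
    return []
```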
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1531
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1532 class QiimeDistanceMatrix(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1533 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1534 PC.354 PC.355 PC.356
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1535 PC.354 0.0 3.177 1.955
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1536 PC.355 3.177 0.0 3.444
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1537 PC.356 1.955 3.444 0.0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1538 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1539 file_ext = 'qiimedistmat'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1540 def init_meta( self, dataset, copy_from=None ):
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
1541 Tabular.init_meta( self, dataset, copy_from=copy_from )
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1542 def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1543 self.set_column_names(dataset)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1544 def set_column_names(self, dataset):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1545 if dataset.has_data():
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1546 dataset_fh = open( dataset.file_name )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1547 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1548 # first line contains the names
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1549 dataset.metadata.column_names = line.strip().split('\t')
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1550 dataset_fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1551 dataset.metadata.comment_lines = 1
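A distance matrix in the layout shown in the docstring can be read into a nested dictionary. This is a sketch under the assumption that the file is tab-delimited, with sample names on the first line and one labelled row per sample; `read_distance_matrix` is a hypothetical name and the code targets Python 3.

```python
def read_distance_matrix(lines):
    """Return (names, matrix) where matrix[row][column] is a float distance."""
    rows = [line.rstrip('\n') for line in lines if line.strip()]
    if not rows:
        return [], {}
    # The first line holds the sample names; strip() drops any leading tab.
    names = rows[0].strip().split('\t')
    matrix = {}
    for row in rows[1:]:
        fields = row.split('\t')
        matrix[fields[0]] = dict(zip(names, (float(v) for v in fields[1:])))
    return names, matrix
```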
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1552
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1553 class QiimePCA(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1554 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1555 Principal Coordinate Analysis Data
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1556 The principal coordinate (PC) axes (columns) for each sample (rows).
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1557 Pairs of PCs can then be graphed to view the relationships between samples.
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1558 The bottom of the output file contains the eigenvalues and % variation explained for each PC.
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1559 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1560 pc vector number 1 2 3
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1561 PC.354 -0.309063936588 0.0398252112257 0.0744672231759
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1562 PC.355 -0.106593922619 0.141125998277 0.0780204374172
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1563 PC.356 -0.219869362955 0.00917241121781 0.0357281314115
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1564
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1565
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1566 eigvals 0.480220500471 0.163567082874 0.125594470811
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1567 % variation explained 51.6955484555 17.6079322939
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1568 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1569 file_ext = 'qiimepca'
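The layout described in the QiimePCA docstring (sample rows, then eigenvalues and percent variation at the bottom) could be parsed roughly as follows. The sketch assumes tab-delimited fields and the exact row labels from the example; `read_pca` is a hypothetical name and the code targets Python 3.

```python
def read_pca(lines):
    """Return (coords, eigvals, explained) parsed from a PCA output file."""
    coords, eigvals, explained = {}, [], []
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        label, values = fields[0], fields[1:]
        # Skip blank separator lines and the axis-number header row.
        if not label or label == 'pc vector number':
            continue
        numbers = [float(v) for v in values]
        if label == 'eigvals':
            eigvals = numbers
        elif label == '% variation explained':
            explained = numbers
        else:
            coords[label] = numbers  # one row of PC coordinates per sample
    return coords, eigvals, explained
```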
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1570
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1571 class QiimeParams(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1572 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1573 ###pick_otus_through_otu_table.py parameters###
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1574
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1575 # OTU picker parameters
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1576 pick_otus:otu_picking_method uclust
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1577 pick_otus:clustering_algorithm furthest
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1578
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1579 # Representative set picker parameters
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1580 pick_rep_set:rep_set_picking_method first
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1581 pick_rep_set:sort_by otu
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1582 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1583 file_ext = 'qiimeparams'
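The parameters file shown in the QiimeParams docstring pairs a `script:option` name with a value. A minimal reader might look like the sketch below, which assumes whitespace between name and value and `#` for comment lines, as in the example; `read_params` is a hypothetical name and the code targets Python 3.

```python
def read_params(lines):
    """Return {script: {option: value}} from 'script:option value' lines."""
    params = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        fields = line.split(None, 1)
        name = fields[0]
        value = fields[1] if len(fields) > 1 else None  # option given without a value
        script, _, option = name.partition(':')
        params.setdefault(script, {})[option] = value
    return params
```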
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1584
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1585 class QiimePrefs(Text):
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1586 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1587 A text file containing coloring preferences used by make_distance_histograms.py, make_2d_plots.py and make_3d_plots.py.
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1588 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1589 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1590 'background_color':'black',
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1591
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1592 'sample_coloring':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1593 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1594 'Treatment':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1595 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1596 'column':'Treatment',
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1597 'colors':(('red',(0,100,100)),('blue',(240,100,100)))
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1598 },
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1599 'DOB':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1600 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1601 'column':'DOB',
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1602 'colors':(('red',(0,100,100)),('blue',(240,100,100)))
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1603 }
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1604 },
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1605 'MONTE_CARLO_GROUP_DISTANCES':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1606 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1607 'Treatment': 10,
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1608 'DOB': 10
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1609 }
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1610 }
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1611 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1612 file_ext = 'qiimeprefs'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1613
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1614 class QiimeTaxaSummary(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1615 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1616 Taxon PC.354 PC.355 PC.356
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1617 Root;Bacteria;Actinobacteria 0.0 0.177 0.955
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1618 Root;Bacteria;Firmicutes 0.177 0.0 0.444
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1619 Root;Bacteria;Proteobacteria 0.955 0.444 0.0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1620 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1621 MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1622 file_ext = 'qiimetaxsummary'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1623
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1624 def set_column_names(self, dataset):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1625 if dataset.has_data():
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1626 dataset_fh = open( dataset.file_name )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1627 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1628 if line.startswith('Taxon'):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1629 dataset.metadata.column_names = line.strip().split('\t')
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1630 dataset_fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1631
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1632 def set_meta( self, dataset, overwrite = True, skip = None, max_data_lines = None, **kwd ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1633 Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1634 self.set_column_names(dataset)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1635
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1636 if __name__ == '__main__':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1637 import doctest, sys
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1638 doctest.testmod(sys.modules[__name__])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1639
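The header handling in `QiimeTaxaSummary.set_column_names` can be sketched outside Galaxy as a standalone function. This is a minimal illustration rather than the module's own code: `read_taxa_summary_columns` and `demo` are hypothetical names introduced here, and a plain file path stands in for the Galaxy `dataset` object. It uses a `with` block so the file handle is closed even if reading fails, which the original open/close pair does not guarantee.

```python
import os
import tempfile


def read_taxa_summary_columns(path):
    """Return the tab-separated header fields, or [] if there is no 'Taxon' header.

    Hypothetical helper: mirrors the check in set_column_names, but takes a
    file path instead of a Galaxy dataset.
    """
    # The context manager closes the handle even if readline() raises.
    with open(path) as handle:
        header = handle.readline()
    if header.startswith('Taxon'):
        return header.strip().split('\t')
    return []


def demo(text):
    """Write text to a temporary file and return the parsed column names."""
    fd, path = tempfile.mkstemp(suffix='.qiimetaxsummary')
    try:
        with os.fdopen(fd, 'w') as handle:
            handle.write(text)
        return read_taxa_summary_columns(path)
    finally:
        os.remove(path)


# Sample rows taken from the QiimeTaxaSummary docstring, tab-separated.
SAMPLE = (
    "Taxon\tPC.354\tPC.355\tPC.356\n"
    "Root;Bacteria;Actinobacteria\t0.0\t0.177\t0.955\n"
    "Root;Bacteria;Firmicutes\t0.177\t0.0\t0.444\n"
)

if __name__ == '__main__':
    print(demo(SAMPLE))
```

A file whose first line does not start with `Taxon` yields an empty list here, which matches the `default=[]` declared on the `column_names` metadata element.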