annotate mothur/lib/galaxy/datatypes/metagenomics.py @ 30:a90d1915a176

metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
author Jim Johnson <jj@umn.edu>
date Thu, 30 May 2013 08:59:17 -0500
parents 9c0cd3b92295
children ec8df51e841a
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
2 metagenomics datatypes
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
3 James E Johnson - University of Minnesota
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
4 for Mothur
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
5 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
6
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
7 import logging, os, os.path, sys, time, tempfile, shutil, string, glob, re
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
8 import galaxy.model
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
9 from galaxy.datatypes.sniff import *
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
10 from galaxy.datatypes.metadata import MetadataElement
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
11 from galaxy.datatypes.data import Text
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
12 from galaxy.datatypes.tabular import Tabular
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
13 from galaxy.datatypes.sequence import Fasta
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
14 from galaxy import util
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
15 from galaxy.datatypes.images import Html
26
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
16 import pkg_resources
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
17 pkg_resources.require("simplejson")
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
18 import simplejson
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
19
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
20 log = logging.getLogger(__name__)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
21
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
22 ## Mothur Classes
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
23
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
24 class Otu( Text ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
25 file_ext = 'otu'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
26 MetadataElement( name="columns", default=0, desc="Number of columns", readonly=True, visible=True, no_value=0 )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
27 MetadataElement( name="labels", default=[], desc="Label Names", readonly=True, visible=True, no_value=[] )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
28 def __init__(self, **kwd):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
29 Text.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
30 def set_meta( self, dataset, overwrite = True, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
31 if dataset.has_data():
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
32 label_names = set()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
33 ncols = 0
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
34 data_lines = 0
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
35 comment_lines = 0
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
36 try:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
37 fh = open( dataset.file_name )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
38 for line in fh:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
39 fields = line.strip().split('\t')
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
40 if len(fields) >= 2:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
41 data_lines += 1
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
42 ncols = max(ncols,len(fields))
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
43 label_names.add(fields[0])
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
44 else:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
45 comment_lines += 1
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
46 # Set the discovered metadata values for the dataset
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
47 dataset.metadata.data_lines = data_lines
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
48 dataset.metadata.columns = ncols
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
49 dataset.metadata.labels = []
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
50 dataset.metadata.labels += label_names
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
51 dataset.metadata.labels.sort()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
52 finally:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
53 fh.close()
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
54
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
55 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
56 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
57 Determines whether the file is a otu (operational taxonomic unit) format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
58 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
59 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
60 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
61 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
62 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
63 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
64 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
65 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
66 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
67 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
68 if line[0] != '@':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
69 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
70 if len(linePieces) < 2:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
71 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
72 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
73 check = int(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
74 if check + 2 != len(linePieces):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
75 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
76 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
77 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
78 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
79 if count == 5:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
80 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
81 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
82 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
83 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
84 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
85 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
86 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
87 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
88 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
89
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
90 class OtuList( Otu ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
91 file_ext = 'list'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
92 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
93 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
94 # http://www.mothur.org/wiki/List_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
95 The first column is a label that represents the distance that sequences were assigned to OTUs.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
96 The number in the second column is the number of OTUs that have been formed.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
97 Subsequent columns contain the names of sequences within each OTU separated by a comma.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
98 distance_label otu_count OTU1 OTU2 OTUn
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
99 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
100 Otu.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
101 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
102 Otu.init_meta( self, dataset, copy_from=copy_from )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
103 def set_meta( self, dataset, overwrite = True, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
104 Otu.set_meta(self,dataset, overwrite = True, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
105 """
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
106 # too many columns to be stored in metadata
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
107 if dataset != None and dataset.metadata.columns > 2:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
108 for i in range(2,dataset.metadata.columns):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
109 dataset.metadata.column_types[i] = 'str'
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
110 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
111
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
112 class Sabund( Otu ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
113 file_ext = 'sabund'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
114 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
115 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
116 # http://www.mothur.org/wiki/Sabund_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
117 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
118 Otu.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
119 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
120 Otu.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
121 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
122 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
123 Determines whether the file is a otu (operational taxonomic unit) format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
124 label<TAB>count[<TAB>value(1..n)]
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
125
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
126 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
127 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
128 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
129 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
130 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
131 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
132 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
133 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
134 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
135 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
136 if line[0] != '@':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
137 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
138 if len(linePieces) < 2:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
139 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
140 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
141 check = int(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
142 if check + 2 != len(linePieces):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
143 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
144 for i in range( 2, len(linePieces)):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
145 ival = int(linePieces[i])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
146 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
147 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
148 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
149 if count >= 5:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
150 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
151 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
152 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
153 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
154 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
155 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
156 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
157 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
158 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
159
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
160 class Rabund( Sabund ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
161 file_ext = 'rabund'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
162 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
163 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
164 # http://www.mothur.org/wiki/Rabund_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
165 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
166 Sabund.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
167 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
168 Sabund.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
169
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
170 class GroupAbund( Otu ):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
171 file_ext = 'grpabund'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
172 MetadataElement( name="groups", default=[], desc="Group Names", readonly=True, visible=True, no_value=[] )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
173 def __init__(self, **kwd):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
174 Otu.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
175 # self.column_names[0] = ['label']
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
176 # self.column_names[1] = ['group']
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
177 # self.column_names[2] = ['count']
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
178 """
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
179 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
180 Otu.init_meta( self, dataset, copy_from=copy_from )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
181 """
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
182 def init_meta( self, dataset, copy_from=None ):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
183 Otu.init_meta( self, dataset, copy_from=copy_from )
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
184 def set_meta( self, dataset, overwrite = True, skip=1, max_data_lines = 100000, **kwd ):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
185 # See if file starts with header line
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
186 if dataset.has_data():
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
187 label_names = set()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
188 group_names = set()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
189 data_lines = 0
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
190 comment_lines = 0
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
191 ncols = 0
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
192 try:
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
193 fh = open( dataset.file_name )
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
194 line = fh.readline()
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
195 fields = line.strip().split('\t')
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
196 ncols = max(ncols,len(fields))
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
197 if fields[0] == 'label' and fields[1] == 'Group':
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
198 skip=1
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
199 comment_lines += 1
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
200 else:
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
201 skip=0
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
202 data_lines += 1
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
203 label_names.add(fields[0])
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
204 group_names.add(fields[1])
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
205 for line in fh:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
206 data_lines += 1
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
207 fields = line.strip().split('\t')
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
208 ncols = max(ncols,len(fields))
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
209 label_names.add(fields[0])
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
210 group_names.add(fields[1])
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
211 # Set the discovered metadata values for the dataset
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
212 dataset.metadata.data_lines = data_lines
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
213 dataset.metadata.columns = ncols
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
214 dataset.metadata.labels = []
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
215 dataset.metadata.labels += label_names
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
216 dataset.metadata.labels.sort()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
217 dataset.metadata.groups = []
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
218 dataset.metadata.groups += group_names
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
219 dataset.metadata.groups.sort()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
220 dataset.metadata.skip = skip
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
221 finally:
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
222 fh.close()
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
223
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
224 def sniff( self, filename, vals_are_int=False):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
225 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
226 Determines whether the file is a otu (operational taxonomic unit) Shared format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
227 label<TAB>group<TAB>count[<TAB>value(1..n)]
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
228 The first line is column headings as of Mothur v 1.20
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
229 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
230 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
231 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
232 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
233 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
234 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
235 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
236 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
237 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
238 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
239 if line[0] != '@':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
240 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
241 if len(linePieces) < 3:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
242 return False
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
243 if count > 0 or linePieces[0] != 'label':
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
244 try:
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
245 check = int(linePieces[2])
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
246 if check + 3 != len(linePieces):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
247 return False
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
248 for i in range( 3, len(linePieces)):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
249 if vals_are_int:
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
250 ival = int(linePieces[i])
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
251 else:
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
252 fval = float(linePieces[i])
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
253 except ValueError:
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
254 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
255 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
256 if count >= 5:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
257 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
258 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
259 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
260 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
261 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
262 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
263 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
264 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
265 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
266
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
267 class SharedRabund( GroupAbund ):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
268 file_ext = 'shared'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
269 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
270 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
271 # http://www.mothur.org/wiki/Shared_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
272 A shared file is analogous to an rabund file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
273 The data in a shared file represent the number of times that an OTU is observed in multiple samples.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
274 The structure of a shared file is analogous to an rabund file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
275 The first column contains the label for the comparison - this will be the value for the first column of each line from the original list file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
276 The second column contains the group name that designates where the data is coming from for that row.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
277 The third column is the number of OTUs that were found between each of the groups and is the number of columns that follow.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
278 Finally, the remaining columns indicate the number of sequences that belonged to each OTU from that group.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
279 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
280 GroupAbund.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
281 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
282 GroupAbund.init_meta( self, dataset, copy_from=copy_from )
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
283 def sniff( self, filename ):
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
284 """
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
285 Determines whether the file is a otu (operational taxonomic unit) Shared format
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
286 label<TAB>group<TAB>count[<TAB>value(1..n)]
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
287 The first line is column headings as of Mothur v 1.20
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
288 """
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
289 # return GroupAbund.sniff(self,filename,True)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
290 isme = GroupAbund.sniff(self,filename,True)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
291 return isme
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
292
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
293
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
294 class RelAbund( GroupAbund ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
295 file_ext = 'relabund'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
296 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
297 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
298 # http://www.mothur.org/wiki/Relabund_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
299 The structure of a relabund file is analogous to an shared file.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
300 The first column contains the label for the comparison - this will be the value for the first column of each line from the original list file (e.g. final.an.list).
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
301 The second column contains the group name that designates where the data is coming from for that row. Next is the number of OTUs that were found between each of the groups and is the number of columns that follow.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
302 Finally, the remaining columns indicate the relative abundance of each OTU from that group.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
303 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
304 GroupAbund.__init__( self, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
305 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
306 GroupAbund.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
307 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
308 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
309 Determines whether the file is a otu (operational taxonomic unit) Relative Abundance format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
310 label<TAB>group<TAB>count[<TAB>value(1..n)]
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
311 The first line is column headings as of Mothur v 1.20
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
312 """
7
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
313 # return GroupAbund.sniff(self,filename,False)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
314 isme = GroupAbund.sniff(self,filename,False)
7bfe1f843858 Support Mothur v1.20
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
315 return isme
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
316
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
317 class SecondaryStructureMap(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
318 file_ext = 'map'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
319 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
320 """Initialize secondary structure map datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
321 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
322 self.column_names = ['Map']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
323
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
324 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
325 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
326 Determines whether the file is a secondary structure map format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
327 A single column with an integer value which indicates the row that this row maps to.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
328 check you make sure is structMap[10] = 380 then structMap[380] = 10.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
329 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
330 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
331 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
332 line_num = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
333 rowidxmap = {}
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
334 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
335 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
336 line_num += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
337 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
338 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
339 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
340 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
341 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
342 pointer = int(line)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
343 if pointer > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
344 if pointer > line_num:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
345 rowidxmap[line_num] = pointer
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
346 elif pointer < line_num & rowidxmap[pointer] != line_num:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
347 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
348 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
349 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
350 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
351 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
352 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
353 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
354 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
355 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
356 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
357 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
358
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
359 class SequenceAlignment( Fasta ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
360 file_ext = 'align'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
361 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
362 Fasta.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
363 """Initialize AlignCheck datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
364
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
365 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
366 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
367 Determines whether the file is in Mothur align fasta format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
368 Each sequence line must be the same length
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
369 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
370
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
371 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
372 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
373 len = -1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
374 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
375 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
376 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
377 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
378 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
379 if line: #first non-empty line
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
380 if line.startswith( '>' ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
381 #The next line.strip() must not be '', nor startwith '>'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
382 line = fh.readline().strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
383 if line == '' or line.startswith( '>' ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
384 break
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
385 if len < 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
386 len = len(line)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
387 elif len != len(line):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
388 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
389 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
390 break #we found a non-empty line, but its not a fasta header
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
391 if len > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
392 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
393 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
394 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
395 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
396 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
397 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
398
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
399 class AlignCheck( Tabular ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
400 file_ext = 'align.check'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
401 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
402 """Initialize AlignCheck datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
403 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
404 self.column_names = ['name','pound','dash','plus','equal','loop','tilde','total']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
405 self.column_types = ['str','int','int','int','int','int','int','int']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
406 self.comment_lines = 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
407
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
408 def set_meta( self, dataset, overwrite = True, **kwd ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
409 # Tabular.set_meta( self, dataset, overwrite = overwrite, first_line_is_header = True, skip = 1 )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
410 data_lines = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
411 if dataset.has_data():
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
412 dataset_fh = open( dataset.file_name )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
413 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
414 line = dataset_fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
415 if not line: break
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
416 data_lines += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
417 dataset_fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
418 dataset.metadata.comment_lines = 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
419 dataset.metadata.data_lines = data_lines - 1 if data_lines > 0 else 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
420 dataset.metadata.column_names = self.column_names
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
421 dataset.metadata.column_types = self.column_types
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
422
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
423 class AlignReport(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
424 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
425 QueryName QueryLength TemplateName TemplateLength SearchMethod SearchScore AlignmentMethod QueryStart QueryEnd TemplateStart TemplateEnd PairwiseAlignmentLength GapsInQuery GapsInTemplate LongestInsert SimBtwnQuery&Template
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
426 AY457915 501 82283 1525 kmer 89.07 needleman 5 501 1 499 499 2 0 0 97.6
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
427 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
428 file_ext = 'align.report'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
429 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
430 """Initialize AlignCheck datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
431 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
432 self.column_names = ['QueryName','QueryLength','TemplateName','TemplateLength','SearchMethod','SearchScore',
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
433 'AlignmentMethod','QueryStart','QueryEnd','TemplateStart','TemplateEnd',
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
434 'PairwiseAlignmentLength','GapsInQuery','GapsInTemplate','LongestInsert','SimBtwnQuery&Template'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
435 ]
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
436
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
437 class BellerophonChimera( Tabular ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
438 file_ext = 'bellerophon.chimera'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
439 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
440 """Initialize AlignCheck datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
441 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
442 self.column_names = ['Name','Score','Left','Right']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
443
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
444 class SecondaryStructureMatch(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
445 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
446 name pound dash plus equal loop tilde total
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
447 9_1_12 42 68 8 28 275 420 872
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
448 9_1_14 36 68 6 26 266 422 851
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
449 9_1_15 44 68 8 28 276 418 873
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
450 9_1_16 34 72 6 30 267 430 860
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
451 9_1_18 46 80 2 36 261
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
452 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
453 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
454 """Initialize SecondaryStructureMatch datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
455 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
456 self.column_names = ['name','pound','dash','plus','equal','loop','tilde','total']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
457
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
458 class DistanceMatrix( Text ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
459 file_ext = 'dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
460 """Add metadata elements"""
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
461 MetadataElement( name="sequence_count", default=0, desc="Number of sequences", readonly=True, visible=True, optional=True, no_value='?' )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
462
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
463 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
464 Text.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
465
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
466 def set_meta( self, dataset, overwrite = True, skip = 0, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
467 Text.set_meta(self, dataset,overwrite = overwrite, skip = skip, **kwd )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
468 try:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
469 fh = open( dataset.file_name )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
470 line = fh.readline().strip().strip()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
471 dataset.metadata.sequence_count = int(line)
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
472 except Exception, e:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
473 log.warn("DistanceMatrix set_meta %s" % e)
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
474 finally:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
475 fh.close()
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
476
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
477 class LowerTriangleDistanceMatrix(DistanceMatrix):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
478 file_ext = 'lower.dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
479 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
480 """Initialize secondary structure map datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
481 DistanceMatrix.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
482
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
483 def init_meta( self, dataset, copy_from=None ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
484 DistanceMatrix.init_meta( self, dataset, copy_from=copy_from )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
485
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
486 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
487 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
488 Determines whether the file is a lower-triangle distance matrix (phylip) format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
489 The first line has the number of sequences in the matrix.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
490 The remaining lines have the sequence name followed by a list of distances from all preceeding sequences
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
491 5
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
492 U68589
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
493 U68590 0.3371
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
494 U68591 0.3609 0.3782
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
495 U68592 0.4155 0.3197 0.4148
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
496 U68593 0.2872 0.1690 0.3361 0.2842
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
497 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
498 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
499 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
500 count = 0
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
501 line = fh.readline()
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
502 sequence_count = int(line.strip())
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
503 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
504 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
505 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
506 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
507 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
508 if line:
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
509 # Split into fields
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
510 linePieces = line.split('\t')
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
511 # Each line should have the same number of
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
512 # fields as the Python line index
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
513 linePieces = line.split('\t')
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
514 if len(linePieces) != (count + 1):
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
515 return False
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
516 # Distances should be floats
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
517 try:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
518 for linePiece in linePieces[2:]:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
519 check = float(linePiece)
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
520 except ValueError:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
521 return False
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
522 # Increment line counter
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
523 count += 1
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
524 # Only check first 5 lines
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
525 if count == 5:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
526 return True
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
527 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
528 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
529 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
530 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
531 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
532 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
533 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
534 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
535
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
536 class SquareDistanceMatrix(DistanceMatrix):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
537 file_ext = 'square.dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
538
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
539 def __init__(self, **kwd):
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
540 DistanceMatrix.__init__( self, **kwd )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
541 def init_meta( self, dataset, copy_from=None ):
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
542 DistanceMatrix.init_meta( self, dataset, copy_from=copy_from )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
543
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
544 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
545 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
546 Determines whether the file is a square distance matrix (Column-formatted distance matrix) format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
547 The first line has the number of sequences in the matrix.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
548 The following lines have the sequence name in the first column plus a column for the distance to each sequence
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
549 in the row order in which they appear in the matrix.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
550 3
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
551 U68589 0.0000 0.3371 0.3610
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
552 U68590 0.3371 0.0000 0.3783
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
553 U68590 0.3371 0.0000 0.3783
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
554 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
555 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
556 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
557 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
558 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
559 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
560 sequence_count = int(line)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
561 col_cnt = seq_cnt + 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
562 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
563 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
564 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
565 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
566 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
567 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
568 if line[0] != '@':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
569 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
570 if len(linePieces) != col_cnt :
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
571 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
572 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
573 for i in range(1, col_cnt):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
574 check = float(linePieces[i])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
575 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
576 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
577 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
578 if count == 5:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
579 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
580 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
581 if count < 5 and count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
582 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
583 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
584 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
585 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
586 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
587 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
588
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
589 class PairwiseDistanceMatrix(DistanceMatrix,Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
590 file_ext = 'pair.dist'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
591 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
592 """Initialize secondary structure map datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
593 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
594 self.column_names = ['Sequence','Sequence','Distance']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
595 self.column_types = ['str','str','float']
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
596 def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
597 Tabular.set_meta(self, dataset,overwrite = overwrite, skip = skip, **kwd )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
598
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
599 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
600 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
601 Determines whether the file is a pairwise distance matrix (Column-formatted distance matrix) format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
602 The first and second columns have the sequence names and the third column is the distance between those sequences.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
603 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
604 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
605 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
606 count = 0
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
607 all_ints = True
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
608 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
609 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
610 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
611 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
612 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
613 if line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
614 if line[0] != '@':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
615 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
616 if len(linePieces) != 3:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
617 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
618 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
619 check = float(linePieces[2])
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
620 try:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
621 # See if it's also an integer
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
622 check_int = int(linePieces[2])
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
623 except ValueError:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
624 # At least one value is not an
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
625 # integer
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
626 all_ints = False
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
627 except ValueError:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
628 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
629 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
630 if count == 5:
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
631 if not all_ints:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
632 return True
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
633 else:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
634 return False
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
635 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
636 if count < 5 and count > 0:
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
637 if not all_ints:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
638 return True
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
639 else:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
640 return False
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
641 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
642 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
643 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
644 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
645 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
646
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
647 class AlignCheck(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
648 file_ext = 'align.check'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
649 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
650 """Initialize secondary structure map datatype"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
651 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
652 self.column_names = ['name','pound','dash','plus','equal','loop','tilde','total']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
653 self.columns = 8
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
654
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
655 class Names(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
656 file_ext = 'names'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
657 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
658 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
659 # http://www.mothur.org/wiki/Name_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
660 Name file shows the relationship between a representative sequence(col 1) and the sequences(comma-separated) it represents(col 2)
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
661 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
662 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
663 self.column_names = ['name','representatives']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
664 self.columns = 2
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
665
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
666 class Summary(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
667 file_ext = 'summary'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
668 def __init__(self, **kwd):
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
669 """summarizes the quality of sequences in an unaligned or aligned fasta-formatted sequence file"""
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
670 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
671 self.column_names = ['seqname','start','end','nbases','ambigs','polymer']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
672 self.columns = 6
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
673
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
674 class Group(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
675 file_ext = 'groups'
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
676 MetadataElement( name="groups", default=[], desc="Group Names", readonly=True, visible=True, no_value=[] )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
677 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
678 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
679 # http://www.mothur.org/wiki/Groups_file
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
680 Group file assigns sequence (col 1) to a group (col 2)
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
681 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
682 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
683 self.column_names = ['name','group']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
684 self.columns = 2
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
685 def set_meta( self, dataset, overwrite = True, skip = None, max_data_lines = None, **kwd ):
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
686 Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
687 group_names = set()
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
688 try:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
689 fh = open( dataset.file_name )
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
690 for line in fh:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
691 fields = line.strip().split('\t')
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
692 group_names.add(fields[1])
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
693 dataset.metadata.groups = []
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
694 dataset.metadata.groups += group_names
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
695 finally:
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
696 fh.close()
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
697
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
698 class Design(Group):
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
699 file_ext = 'design'
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
700 def __init__(self, **kwd):
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
701 """
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
702 # http://www.mothur.org/wiki/Design_File
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
703 Design file shows the relationship between a group(col 1) and a grouping (col 2), providing a way to merge groups.
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
704 """
25
bfbaf823be4c Change metagenomics datatypes to include labels and groups metadata. change Mothur tool configs to get label and group select options from a data_meta filter rather than using the options from_dataset attribute. This grealty decreases memory demand for the galaxy server.
Jim Johnson <jj@umn.edu>
parents: 17
diff changeset
705 Group.__init__( self, **kwd )
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
706
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
707 class AccNos(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
708 file_ext = 'accnos'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
709 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
710 """A list of names"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
711 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
712 self.column_names = ['name']
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
713 self.columns = 1
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
714
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
715 class Oligos( Text ):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
716 file_ext = 'oligos'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
717
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
718 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
719 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
720 # http://www.mothur.org/wiki/Oligos_File
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
721 Determines whether the file is a otu (operational taxonomic unit) format
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
722 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
723 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
724 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
725 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
726 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
727 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
728 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
729 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
730 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
731 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
732 if line[0] != '#':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
733 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
734 if len(linePieces) == 2 and re.match('forward|reverse',linePieces[0]):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
735 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
736 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
737 elif len(linePieces) == 3 and re.match('barcode',linePieces[0]):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
738 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
739 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
740 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
741 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
742 if count > 20:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
743 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
744 if count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
745 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
746 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
747 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
748 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
749 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
750 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
751
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
752 class Frequency(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
753 file_ext = 'freq'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
754 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
755 """A list of names"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
756 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
757 self.column_names = ['position','frequency']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
758 self.column_types = ['int','float']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
759
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
760 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
761 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
762 Determines whether the file is a frequency tabular format for chimera analysis
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
763 #1.14.0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
764 0 0.000
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
765 1 0.000
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
766 ...
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
767 155 0.975
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
768 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
769 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
770 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
771 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
772 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
773 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
774 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
775 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
776 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
777 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
778 if line[0] != '#':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
779 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
780 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
781 i = int(linePieces[0])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
782 f = float(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
783 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
784 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
785 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
786 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
787 if count > 20:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
788 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
789 if count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
790 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
791 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
792 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
793 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
794 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
795 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
796
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
797 class Quantile(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
798 file_ext = 'quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
799 MetadataElement( name="filtered", default=False, no_value=False, optional=True , desc="Quantiles calculated using a mask", readonly=True)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
800 MetadataElement( name="masked", default=False, no_value=False, optional=True , desc="Quantiles calculated using a frequency filter", readonly=True)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
801 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
802 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
803 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
804 self.column_names = ['num','ten','twentyfive','fifty','seventyfive','ninetyfive','ninetynine']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
805 self.column_types = ['int','float','float','float','float','float','float']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
806 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
807 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
808 Determines whether the file is a quantiles tabular format for chimera analysis
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
809 1 0 0 0 0 0 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
810 2 0.309198 0.309198 0.37161 0.37161 0.37161 0.37161
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
811 3 0.510982 0.563213 0.693529 0.858939 1.07442 1.20608
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
812 ...
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
813 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
814 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
815 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
816 count = 0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
817 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
818 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
819 line = line.strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
820 if not line:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
821 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
822 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
823 if line[0] != '#':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
824 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
825 linePieces = line.split('\t')
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
826 i = int(linePieces[0])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
827 f = float(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
828 f = float(linePieces[2])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
829 f = float(linePieces[3])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
830 f = float(linePieces[4])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
831 f = float(linePieces[5])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
832 f = float(linePieces[6])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
833 count += 1
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
834 continue
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
835 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
836 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
837 if count > 10:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
838 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
839 if count > 0:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
840 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
841 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
842 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
843 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
844 fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
845 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
846
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
847 class FilteredQuantile(Quantile):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
848 file_ext = 'filtered.quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
849 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
850 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
851 Quantile.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
852 self.filtered = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
853
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
854 class MaskedQuantile(Quantile):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
855 file_ext = 'masked.quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
856 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
857 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
858 Quantile.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
859 self.masked = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
860 self.filtered = False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
861
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
862 class FilteredMaskedQuantile(Quantile):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
863 file_ext = 'filtered.masked.quan'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
864 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
865 """Quantiles for chimera analysis"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
866 Quantile.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
867 self.masked = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
868 self.filtered = True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
869
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
870 class LaneMask(Text):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
871 file_ext = 'filter'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
872
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
873 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
874 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
875 Determines whether the file is a lane mask filter: 1 line consisting of zeros and ones.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
876 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
877 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
878 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
879 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
880 buff = fh.read(1000)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
881 if not buff:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
882 break #EOF
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
883 else:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
884 if not re.match('^[01]+$',line):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
885 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
886 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
887 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
888 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
889 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
890 close(fh)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
891 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
892
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
893 class CountTable(Tabular):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
894 MetadataElement( name="groups", default=[], desc="Group Names", readonly=True, visible=True, no_value=[] )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
895 file_ext = 'count_table'
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
896
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
897 def __init__(self, **kwd):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
898 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
899 # http://www.mothur.org/wiki/Count_File
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
900 A table with first column names and following columns integer counts
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
901 # Example 1:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
902 Representative_Sequence total
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
903 U68630 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
904 U68595 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
905 U68600 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
906 # Example 2 (with group columns):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
907 Representative_Sequence total forest pasture
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
908 U68630 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
909 U68595 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
910 U68600 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
911 U68591 1 1 0
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
912 U68647 1 0 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
913 """
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
914 Tabular.__init__( self, **kwd )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
915 self.column_names = ['name','total']
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
916
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
917 def set_meta( self, dataset, overwrite = True, skip = 1, max_data_lines = None, **kwd ):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
918 try:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
919 data_lines = 0;
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
920 fh = open( dataset.file_name )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
921 line = fh.readline()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
922 if line:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
923 line = line.strip()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
924 colnames = line.split()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
925 if len(colnames) > 1:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
926 dataset.metadata.columns = len( colnames )
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
927 if len(colnames) > 2:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
928 dataset.metadata.groups = colnames[2:]
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
929 column_types = ['str']
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
930 for i in range(1,len(colnames)):
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
931 column_types.append('int')
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
932 dataset.metadata.column_types = column_types
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
933 dataset.metadata.comment_lines = 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
934 while line:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
935 line = fh.readline()
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
936 if not line: break
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
937 data_lines += 1
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
938 dataset.metadata.data_lines = data_lines
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
939 finally:
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
940 close(fh)
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
941
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
942 class RefTaxonomy(Tabular):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
943 file_ext = 'ref.taxonomy'
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
944 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
945 # http://www.mothur.org/wiki/Taxonomy_outline
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
946 A table with 2 or 3 columns:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
947 - SequenceName
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
948 - Taxonomy (semicolon-separated taxonomy in descending order)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
949 - integer ?
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
950 Example: 2-column ( http://www.mothur.org/wiki/Taxonomy_outline )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
951 X56533.1 Eukaryota;Alveolata;Ciliophora;Intramacronucleata;Oligohymenophorea;Hymenostomatida;Tetrahymenina;Glaucomidae;Glaucoma;
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
952 X97975.1 Eukaryota;Parabasalidea;Trichomonada;Trichomonadida;unclassified_Trichomonadida;
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
953 AF052717.1 Eukaryota;Parabasalidea;
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
954 Example: 3-column ( http://vamps.mbl.edu/resources/databases.php )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
955 v3_AA008 Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus 5
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
956 v3_AA016 Bacteria 120
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
957 v3_AA019 Archaea;Crenarchaeota;Marine_Group_I 1
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
958 """
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
959 def __init__(self, **kwd):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
960 Tabular.__init__( self, **kwd )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
961 self.column_names = ['name','taxonomy']
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
962
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
963 def sniff( self, filename ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
964 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
965 Determines whether the file is a Reference Taxonomy
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
966 """
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
967 try:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
968 pat = '^([^ \t\n\r\x0c\x0b;]+([(]\\d+[)])?(;[^ \t\n\r\x0c\x0b;]+([(]\\d+[)])?)*(;)?)$'
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
969 fh = open( filename )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
970 count = 0
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
971 # VAMPS taxonomy files do not require a semicolon after the last taxonomy category
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
972 # but assume assume the file will have some multi-level taxonomy assignments
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
973 found_semicolons = False
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
974 while True:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
975 line = fh.readline()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
976 if not line:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
977 break #EOF
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
978 line = line.strip()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
979 if line:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
980 fields = line.split('\t')
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
981 if not (2 <= len(fields) <= 3):
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
982 return False
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
983 if not re.match(pat,fields[1]):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
984 return False
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
985 if not found_semicolons and str(fields[1]).count(';') > 0:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
986 found_semicolons = True
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
987 if len(fields) == 3:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
988 check = int(fields[2])
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
989 count += 1
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
990 if count > 100:
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
991 break
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
992 if count > 0:
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
993 # This will be true if at least one entry
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
994 # has semicolons in the 2nd column
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
995 return found_semicolons
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
996 except:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
997 pass
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
998 finally:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
999 fh.close()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1000 return False
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1001
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1002 class SequenceTaxonomy(RefTaxonomy):
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1003 file_ext = 'seq.taxonomy'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1004 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1005 # http://www.mothur.org/wiki/Taxonomy_outline
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1006 A table with 2 columns:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1007 - SequenceName
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1008 - Taxonomy (semicolon-separated taxonomy in descending order)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1009 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1010 X56533.1 Eukaryota;Alveolata;Ciliophora;Intramacronucleata;Oligohymenophorea;Hymenostomatida;Tetrahymenina;Glaucomidae;Glaucoma;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1011 X97975.1 Eukaryota;Parabasalidea;Trichomonada;Trichomonadida;unclassified_Trichomonadida;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1012 AF052717.1 Eukaryota;Parabasalidea;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1013 """
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1014 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1015 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1016 self.column_names = ['name','taxonomy']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1017
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1018 def sniff( self, filename ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1019 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1020 Determines whether the file is a SequenceTaxonomy
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1021 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1022 try:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1023 pat = '^([^ \t\n\r\f\v;]+([(]\d+[)])?[;])+$'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1024 fh = open( filename )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1025 count = 0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1026 while True:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1027 line = fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1028 if not line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1029 break #EOF
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1030 line = line.strip()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1031 if line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1032 fields = line.split('\t')
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1033 if len(fields) != 2:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1034 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1035 if not re.match(pat,fields[1]):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1036 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1037 count += 1
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1038 if count > 10:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1039 break
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1040 if count > 0:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1041 return True
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1042 except:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1043 pass
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1044 finally:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1045 fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1046 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1047
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1048 class RDPSequenceTaxonomy(SequenceTaxonomy):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1049 file_ext = 'rdp.taxonomy'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1050 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1051 A table with 2 columns:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1052 - SequenceName
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1053 - Taxonomy (semicolon-separated taxonomy in descending order, RDP requires exactly 6 levels deep)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1054 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1055 AB001518.1 Bacteria;Bacteroidetes;Sphingobacteria;Sphingobacteriales;unclassified_Sphingobacteriales;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1056 AB001724.1 Bacteria;Cyanobacteria;Cyanobacteria;Family_II;GpIIa;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1057 AB001774.1 Bacteria;Chlamydiae;Chlamydiae;Chlamydiales;Chlamydiaceae;Chlamydophila;
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1058 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1059 def sniff( self, filename ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1060 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1061 Determines whether the file is a SequenceTaxonomy
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1062 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1063 try:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1064 pat = '^([^ \t\n\r\f\v;]+([(]\d+[)])?[;]){6}$'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1065 fh = open( filename )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1066 count = 0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1067 while True:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1068 line = fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1069 if not line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1070 break #EOF
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1071 line = line.strip()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1072 if line:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1073 fields = line.split('\t')
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1074 if len(fields) != 2:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1075 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1076 if not re.match(pat,fields[1]):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1077 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1078 count += 1
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1079 if count > 10:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1080 break
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1081 if count > 0:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1082 return True
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1083 except:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1084 pass
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1085 finally:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1086 fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1087 return False
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1088
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1089 class ConsensusTaxonomy(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1090 file_ext = 'cons.taxonomy'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1091 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1092 """A list of names"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1093 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1094 self.column_names = ['OTU','count','taxonomy']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1095
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1096 class TaxonomySummary(Tabular):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1097 file_ext = 'tax.summary'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1098 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1099 """A Summary of taxon classification"""
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1100 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1101 self.column_names = ['taxlevel','rankID','taxon','daughterlevels','total']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1102
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1103 class Phylip(Text):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1104 file_ext = 'phy'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1105
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1106 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1107 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1108 Determines whether the file is in Phylip format (Interleaved or Sequential)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1109 The first line of the input file contains the number of species and the
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1110 number of characters, in free format, separated by blanks (not by
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1111 commas). The information for each species follows, starting with a
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1112 ten-character species name (which can include punctuation marks and blanks),
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1113 and continuing with the characters for that species.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1114 http://evolution.genetics.washington.edu/phylip/doc/main.html#inputfiles
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1115 Interleaved Example:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1116 6 39
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1117 Archaeopt CGATGCTTAC CGCCGATGCT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1118 HesperorniCGTTACTCGT TGTCGTTACT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1119 BaluchitheTAATGTTAAT TGTTAATGTT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1120 B. virginiTAATGTTCGT TGTTAATGTT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1121 BrontosaurCAAAACCCAT CATCAAAACC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1122 B.subtilisGGCAGCCAAT CACGGCAGCC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1123
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1124 TACCGCCGAT GCTTACCGC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1125 CGTTGTCGTT ACTCGTTGT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1126 AATTGTTAAT GTTAATTGT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1127 CGTTGTTAAT GTTCGTTGT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1128 CATCATCAAA ACCCATCAT
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1129 AATCACGGCA GCCAATCAC
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1130 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1131 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1132 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1133 # counts line
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1134 line = fh.readline().strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1135 linePieces = line.split()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1136 count = int(linePieces[0])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1137 seq_len = int(linePieces[1])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1138 # data lines
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1139 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1140 TODO check data lines
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1141 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1142 line = fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1143 # name is the first 10 characters
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1144 name = line[0:10]
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1145 seq = line[10:].strip()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1146 # nucleic base or amino acid 1-char designators (spaces allowed)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1147 bases = ''.join(seq.split())
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1148 # float per base (each separated by space)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1149 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1150 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1151 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1152 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1153 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1154 close(fh)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1155 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1156
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1157
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1158 class Axes(Tabular):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1159 file_ext = 'axes'
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1160
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1161 def __init__(self, **kwd):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1162 """Initialize axes datatype"""
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1163 Tabular.__init__( self, **kwd )
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1164 def sniff( self, filename ):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1165 """
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1166 Determines whether the file is an axes format
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1167 The first line may have column headings.
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1168 The following lines have the name in the first column plus float columns for each axis.
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1169 ==> 98_sq_phylip_amazon.fn.unique.pca.axes <==
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1170 group axis1 axis2
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1171 forest 0.000000 0.145743
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1172 pasture 0.145743 0.000000
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1173
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1174 ==> 98_sq_phylip_amazon.nmds.axes <==
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1175 axis1 axis2
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1176 U68589 0.262608 -0.077498
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1177 U68590 0.027118 0.195197
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1178 U68591 0.329854 0.014395
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1179 """
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1180 try:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1181 fh = open( filename )
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1182 count = 0
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1183 line = fh.readline()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1184 line = line.strip()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1185 col_cnt = None
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1186 all_integers = True
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1187 while True:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1188 line = fh.readline()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1189 line = line.strip()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1190 if not line:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1191 break #EOF
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1192 if line:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1193 fields = line.split('\t')
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1194 if col_cnt == None: # ignore values in first line as they may be column headings
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1195 col_cnt = len(fields)
29
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
1196 # There should be at least 2 columns
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
1197 if col_cnt < 2:
9c0cd3b92295 Fixes for metagenomics.py datatypes tahnks to Peter Briggs
Jim Johnson <jj@umn.edu>
parents: 27
diff changeset
1198 return False
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1199 else:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1200 if len(fields) != col_cnt :
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1201 return False
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1202 try:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1203 for i in range(1, col_cnt):
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1204 check = float(fields[i])
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1205 # Also test for whether value is an integer
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1206 try:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1207 check = int(fields[i])
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1208 except ValueError:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1209 all_integers = False
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1210 except ValueError:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1211 return False
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1212 count += 1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1213 if count > 10:
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1214 break
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1215 if count > 0:
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1216 if not all_integers:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1217 # At least one value was a float
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1218 return True
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1219 else:
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1220 return False
1
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1221 except:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1222 pass
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1223 finally:
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1224 fh.close()
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1225 return False
fcc0778f6987 Migrated tool version 1.16.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 0
diff changeset
1226
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1227 class SffFlow(Tabular):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1228 MetadataElement( name="flow_values", default="", no_value="", optional=True , desc="Total number of flow values", readonly=True)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1229 MetadataElement( name="flow_order", default="TACG", no_value="TACG", desc="Total number of flow values", readonly=False)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1230 file_ext = 'sff.flow'
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1231 """
30
a90d1915a176 metagenomics.py - require ref.taxonomy sniff to find at least 1 multi-level tax assignment with semicolon separators
Jim Johnson <jj@umn.edu>
parents: 29
diff changeset
1232 # http://www.mothur.org/wiki/Flow_file
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1233 The first line is the total number of flow values - 800 for Titanium data. For GS FLX it would be 400.
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1234 Following lines contain:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1235 - SequenceName
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1236 - the number of useable flows as defined by 454's software
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1237 - the flow intensity for each base going in the order of TACG.
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1238 Example:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1239 800
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1240 GQY1XT001CQL4K 85 1.04 0.00 1.00 0.02 0.03 1.02 0.05 ...
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1241 GQY1XT001CQIRF 84 1.02 0.06 0.98 0.06 0.09 1.05 0.07 ...
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1242 GQY1XT001CF5YW 88 1.02 0.02 1.01 0.04 0.06 1.02 0.03 ...
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1243 """
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1244 def __init__(self, **kwd):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1245 Tabular.__init__( self, **kwd )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1246
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1247 def set_meta( self, dataset, overwrite = True, skip = 1, max_data_lines = None, **kwd ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1248 Tabular.set_meta(self, dataset, overwrite, 1, max_data_lines)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1249 try:
16
541e3c97c240 Mothur - fix set_meta for SffFlow in datatypes/metagenomics.py
Jim Johnson <jj@umn.edu>
parents: 15
diff changeset
1250 fh = open( dataset.file_name )
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1251 line = fh.readline()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1252 line = line.strip()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1253 flow_values = int(line)
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1254 dataset.metadata.flow_values = flow_values
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1255 finally:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1256 fh.close()
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1257
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1258 def make_html_table( self, dataset, skipchars=[] ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1259 """Create HTML table, used for displaying peek"""
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1260 out = ['<table cellspacing="0" cellpadding="3">']
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1261 comments = []
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1262 try:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1263 # Generate column header
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1264 out.append('<tr>')
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1265 out.append( '<th>%d. Name</th>' % 1 )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1266 out.append( '<th>%d. Flows</th>' % 2 )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1267 for i in range( 3, dataset.metadata.columns+1 ):
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1268 base = dataset.metadata.flow_order[(i+1)%4]
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1269 out.append( '<th>%d. %d %s</th>' % (i-2,base) )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1270 out.append('</tr>')
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1271 out.append( self.make_html_peek_rows( dataset, skipchars=skipchars ) )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1272 out.append( '</table>' )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1273 out = "".join( out )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1274 except Exception, exc:
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1275 out = "Can't create peek %s" % str( exc )
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1276 return out
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1277
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1278 class Newick( Text ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1279 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1280 The Newick Standard for representing trees in computer-readable form makes use of the correspondence between trees and nested parentheses.
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1281 http://evolution.genetics.washington.edu/phylip/newicktree.html
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1282 http://en.wikipedia.org/wiki/Newick_format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1283 Example:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1284 (B,(A,C,E),D);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1285 or example with branch lengths:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1286 (B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1287 or an example with embedded comments but no branch lengths:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1288 ((a [&&PRIME S=x], b [&&PRIME S=y]), c [&&PRIME S=z]);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1289 Example with named interior noe:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1290 (B:6.0,(A:5.0,C:3.0,E:4.0)Ancestor1:5.0,D:11.0);
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1291 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1292 file_ext = 'tre'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1293
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1294 def __init__(self, **kwd):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1295 Text.__init__( self, **kwd )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1296
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1297 def sniff( self, filename ): ## TODO
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1298 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1299 Determine whether the file is in Newick format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1300 Note: Last non-space char of a tree should be a semicolon: ';'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1301 Usually the first char will be a open parenthesis: '('
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1302 (,,(,)); no nodes are named
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1303 (A,B,(C,D)); leaf nodes are named
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1304 (A,B,(C,D)E)F; all nodes are named
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1305 (:0.1,:0.2,(:0.3,:0.4):0.5); all but root node have a distance to parent
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1306 (:0.1,:0.2,(:0.3,:0.4):0.5):0.0; all have a distance to parent
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1307 (A:0.1,B:0.2,(C:0.3,D:0.4):0.5); distances and leaf names (popular)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1308 (A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F; distances and all names
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1309 ((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A; a tree rooted on a leaf node (rare)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1310 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1311 if not os.path.exists(filename):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1312 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1313 try:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1314 ## For now, guess this is a Newick file if it starts with a '(' and ends with a ';'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1315 flen = os.path.getsize(filename)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1316 fh = open( filename )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1317 len = min(flen,2000)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1318 # check end of the file for a semicolon
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1319 fh.seek(-len,os.SEEK_END)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1320 buf = fh.read(len).strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1321 buf = buf.strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1322 if not buf.endswith(';'):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1323 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1324 # See if this starts with a open parenthesis
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1325 if len < flen:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1326 fh.seek(0)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1327 buf = fh.read(len).strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1328 if buf.startswith('('):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1329 return True
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1330 except:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1331 pass
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1332 finally:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1333 close(fh)
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1334 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1335
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1336 class Nhx( Newick ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1337 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1338 New Hampshire eXtended Newick with embedded
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1339 The Newick Standard for representing trees in computer-readable form makes use of the correspondence between trees and nested parentheses.
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1340 http://evolution.genetics.washington.edu/phylip/newicktree.html
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1341 http://en.wikipedia.org/wiki/Newick_format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1342 Example:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1343 (gene1_Hu[&&NHX:S=Hu_Homo_sapiens], (gene2_Hu[&&NHX:S=Hu_Homo_sapiens], gene2_Mu[&&NHX:S=Mu_Mus_musculus]));
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1344 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1345 file_ext = 'nhx'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1346
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1347 class Nexus( Text ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1348 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1349 http://en.wikipedia.org/wiki/Nexus_file
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1350 Example:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1351 #NEXUS
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1352 BEGIN TAXA;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1353 Dimensions NTax=4;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1354 TaxLabels fish frog snake mouse;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1355 END;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1356
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1357 BEGIN CHARACTERS;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1358 Dimensions NChar=20;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1359 Format DataType=DNA;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1360 Matrix
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1361 fish ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1362 frog ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1363 snake ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1364 mouse ACATA GAGGG TACCT CTAAG
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1365 END;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1366
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1367 BEGIN TREES;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1368 Tree best=(fish, (frog, (snake, mouse)));
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1369 END;
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1370 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1371 file_ext = 'nex'
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1372
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1373 def __init__(self, **kwd):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1374 Text.__init__( self, **kwd )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1375
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1376 def sniff( self, filename ):
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1377 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1378 Determines whether the file is in nexus format
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1379 First line should be:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1380 #NEXUS
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1381 """
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1382 try:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1383 fh = open( filename )
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1384 count = 0
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1385 line = fh.readline()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1386 line = line.strip()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1387 if line and line == '#NEXUS':
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1388 fh.close()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1389 return True
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1390 except:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1391 pass
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1392 finally:
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1393 fh.close()
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1394 return False
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1395
15
a6189f58fedb Mothur - updated for Mothur version 1.22.0
Jim Johnson <jj@umn.edu>
parents: 7
diff changeset
1396
26
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1397 ## Biom
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1398
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1399 class BiologicalObservationMatrix( Text ):
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1400 file_ext = 'biom'
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1401 """
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1402 http://biom-format.org/documentation/biom_format.html
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1403 The format of the file is JSON:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1404 {
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1405 "id":null,
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1406 "format": "Biological Observation Matrix 0.9.1-dev",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1407 "format_url": "http://biom-format.org",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1408 "type": "OTU table",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1409 "generated_by": "QIIME revision 1.4.0-dev",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1410 "date": "2011-12-19T19:00:00",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1411 "rows":[
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1412 {"id":"GG_OTU_1", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1413 {"id":"GG_OTU_2", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1414 {"id":"GG_OTU_3", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1415 ],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1416 "columns": [
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1417 {"id":"Sample1", "metadata":null},
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1418 {"id":"Sample2", "metadata":null}
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1419 ],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1420 "matrix_type": "sparse",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1421 "matrix_element_type": "int",
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1422 "shape": [3, 2],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1423 "data":[[0,1,1],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1424 [1,0,5],
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1425 [2,1,4]
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1426 ]
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1427 }
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1428
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1429 """
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1430
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1431 def __init__(self, **kwd):
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1432 Text.__init__( self, **kwd )
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1433
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1434 def sniff( self, filename ):
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1435 if os.path.getsize(filename) < 50000:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1436 try:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1437 data = simplejson.load(open(filename))
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1438 if data['format'].find('Biological Observation Matrix'):
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1439 return True
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1440 except:
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1441 pass
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1442 return False
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1443
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1444
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1445
5c77423823cb Updates for Mothur version 1.25.0 (includes changes to datatypes metagenomics.py and uses more efficient means for labels and groups options)
Jim Johnson <jj@umn.edu>
parents: 25
diff changeset
1446
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1447 ## Qiime Classes
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1448
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1449 class QiimeMetadataMapping(Tabular):
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1450 MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1451 file_ext = 'qiimemapping'
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1452
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1453 def __init__(self, **kwd):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1454 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1455 http://qiime.sourceforge.net/documentation/file_formats.html#mapping-file-overview
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1456 Information about the samples necessary to perform the data analysis.
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1457 # self.column_names = ['#SampleID','BarcodeSequence','LinkerPrimerSequence','Description']
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1458 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1459 Tabular.__init__( self, **kwd )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1460
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1461 def sniff( self, filename ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1462 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1463 Determines whether the file is a qiime mapping file
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1464 Just checking for an appropriate header line for now, could be improved
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1465 """
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1466 try:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1467 pat = '#SampleID(\t[a-zA-Z][a-zA-Z0-9_]*)*\tDescription'
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1468 fh = open( filename )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1469 while True:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1470 line = dataset_fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1471 if re.match(pat,line):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1472 return True
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1473 except:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1474 pass
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1475 finally:
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1476 close(fh)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1477 return False
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1478
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1479 def set_column_names(self, dataset):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1480 if dataset.has_data():
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1481 dataset_fh = open( dataset.file_name )
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1482 line = dataset_fh.readline()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1483 if line.startswith('#SampleID'):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1484 dataset.metadata.column_names = line.strip().split('\t');
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1485 dataset_fh.close()
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1486
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1487 def set_meta( self, dataset, overwrite = True, skip = None, max_data_lines = None, **kwd ):
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1488 Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1489 self.set_column_names(dataset)
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1490
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1491 class QiimeOTU(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1492 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1493 Associates OTUs with sequence IDs
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1494 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1495 0 FLP3FBN01C2MYD FLP3FBN01B2ALM
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1496 1 FLP3FBN01DF6NE FLP3FBN01CKW1J FLP3FBN01CHVM4
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1497 2 FLP3FBN01AXQ2Z
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1498 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1499 file_ext = 'qiimeotu'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1500
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1501 class QiimeOTUTable(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1502 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1503 #Full OTU Counts
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1504 #OTU ID PC.354 PC.355 PC.356 Consensus Lineage
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1505 0 0 1 0 Root;Bacteria;Firmicutes;"Clostridia";Clostridiales
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1506 1 1 3 1 Root;Bacteria
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1507 2 0 2 2 Root;Bacteria;Bacteroidetes
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1508 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1509 MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1510 file_ext = 'qiimeotutable'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1511 def init_meta( self, dataset, copy_from=None ):
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
1512 Tabular.init_meta( self, dataset, copy_from=copy_from )
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1513 def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1514 self.set_column_names(dataset)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1515 def set_column_names(self, dataset):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1516 if dataset.has_data():
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1517 dataset_fh = open( dataset.file_name )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1518 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1519 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1520 if line.startswith('#OTU ID'):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1521 dataset.metadata.column_names = line.strip().split('\t');
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1522 dataset_fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1523 dataset.metadata.comment_lines = 2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1524
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1525 class QiimeDistanceMatrix(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1526 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1527 PC.354 PC.355 PC.356
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1528 PC.354 0.0 3.177 1.955
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1529 PC.355 3.177 0.0 3.444
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1530 PC.356 1.955 3.444 0.0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1531 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1532 file_ext = 'qiimedistmat'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1533 def init_meta( self, dataset, copy_from=None ):
27
49058b1f8d3f Update to mothur version 1.27 and add tool_dependencies.xml to automatically install mothur
Jim Johnson <jj@umn.edu>
parents: 26
diff changeset
1534 Tabular.init_meta( self, dataset, copy_from=copy_from )
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1535 def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1536 self.set_column_names(dataset)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1537 def set_column_names(self, dataset):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1538 if dataset.has_data():
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1539 dataset_fh = open( dataset.file_name )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1540 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1541 # first line contains the names
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1542 dataset.metadata.column_names = line.strip().split('\t');
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1543 dataset_fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1544 dataset.metadata.comment_lines = 1
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1545
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1546 class QiimePCA(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1547 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1548 Principal Coordinate Analysis Data
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1549 The principal coordinate (PC) axes (columns) for each sample (rows).
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1550 Pairs of PCs can then be graphed to view the relationships between samples.
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1551 The bottom of the output file contains the eigenvalues and % variation explained for each PC.
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1552 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1553 pc vector number 1 2 3
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1554 PC.354 -0.309063936588 0.0398252112257 0.0744672231759
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1555 PC.355 -0.106593922619 0.141125998277 0.0780204374172
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1556 PC.356 -0.219869362955 0.00917241121781 0.0357281314115
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1557
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1558
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1559 eigvals 0.480220500471 0.163567082874 0.125594470811
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1560 % variation explained 51.6955484555 17.6079322939
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1561 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1562 file_ext = 'qiimepca'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1563
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1564 class QiimeParams(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1565 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1566 ###pick_otus_through_otu_table.py parameters###
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1567
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1568 # OTU picker parameters
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1569 pick_otus:otu_picking_method uclust
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1570 pick_otus:clustering_algorithm furthest
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1571
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1572 # Representative set picker parameters
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1573 pick_rep_set:rep_set_picking_method first
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1574 pick_rep_set:sort_by otu
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1575 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1576 file_ext = 'qiimeparams'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1577
17
57df76d861e4 Modifications for ToolShed proprietary data types
Jim Johnson <jj@umn.edu>
parents: 16
diff changeset
1578 class QiimePrefs(Text):
2
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1579 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1580 A text file, containing coloring preferences to be used by make_distance_histograms.py, make_2d_plots.py and make_3d_plots.py.
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1581 Example:
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1582 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1583 'background_color':'black',
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1584
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1585 'sample_coloring':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1586 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1587 'Treatment':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1588 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1589 'column':'Treatment',
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1590 'colors':(('red',(0,100,100)),('blue',(240,100,100)))
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1591 },
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1592 'DOB':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1593 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1594 'column':'DOB',
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1595 'colors':(('red',(0,100,100)),('blue',(240,100,100)))
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1596 }
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1597 },
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1598 'MONTE_CARLO_GROUP_DISTANCES':
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1599 {
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1600 'Treatment': 10,
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1601 'DOB': 10
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1602 }
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1603 }
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1604 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1605 file_ext = 'qiimeprefs'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1606
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1607 class QiimeTaxaSummary(Tabular):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1608 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1609 Taxon PC.354 PC.355 PC.356
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1610 Root;Bacteria;Actinobacteria 0.0 0.177 0.955
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1611 Root;Bacteria;Firmicutes 0.177 0.0 0.444
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1612 Root;Bacteria;Proteobacteria 0.955 0.444 0.0
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1613 """
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1614 MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1615 file_ext = 'qiimetaxsummary'
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1616
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1617 def set_column_names(self, dataset):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1618 if dataset.has_data():
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1619 dataset_fh = open( dataset.file_name )
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1620 line = dataset_fh.readline()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1621 if line.startswith('Taxon'):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1622 dataset.metadata.column_names = line.strip().split('\t');
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1623 dataset_fh.close()
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1624
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1625 def set_meta( self, dataset, overwrite = True, skip = None, max_data_lines = None, **kwd ):
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1626 Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1627 self.set_column_names(dataset)
e990ac8a0f58 Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
jjohnson
parents: 1
diff changeset
1628
0
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1629 if __name__ == '__main__':
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1630 import doctest, sys
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1631 doctest.testmod(sys.modules[__name__])
3202a38e44d9 Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
jjohnson
parents:
diff changeset
1632