annotate src/breadcrumbs/README.md @ 0:2f4f6f08c8c4 draft

Uploaded
author george-weingart
date Tue, 13 May 2014 21:58:57 -0400

# BreadCrumbs #

BreadCrumbs is an unofficial collection of scripts and code intended to consolidate functions for tool development and to provide scripts for command-line access to commonly used functions. BreadCrumbs mostly includes functionality associated with metagenomics analysis, but you never know what you will find!


## Dependencies: ##

1. Cogent https://pypi.python.org/pypi/cogent
2. Matplotlib http://matplotlib.org/downloads.html
3. Mercurial http://mercurial.selenic.com/ (optional, for downloading)
4. NumPy http://www.numpy.org/
5. Python 2.x http://www.python.org/download/
6. SciPy http://www.scipy.org/install.html
7. BIOM format support http://biom-format.org/
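
Before running the scripts, it can help to confirm that the Python dependencies listed above are importable. The following is a minimal sketch, not part of BreadCrumbs: the helper names `check_modules` and `missing_modules` are hypothetical, and the import names (`cogent`, `matplotlib`, `numpy`, `scipy`, `biom`) are assumptions about how each package is normally imported. Mercurial is not checked because it is only needed for downloading.

```python
import importlib

# Assumed import names for the Python dependencies listed above.
REQUIRED_MODULES = ["cogent", "matplotlib", "numpy", "scipy", "biom"]


def check_modules(module_names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # Any failure while importing is treated as "not usable".
            status[name] = False
    return status


def missing_modules(module_names):
    """Return the sorted names of the modules that could not be imported."""
    status = check_modules(module_names)
    return sorted(name for name, ok in status.items() if not ok)


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All listed Python modules were found.")
```

Run it with the same Python interpreter that will run the BreadCrumbs scripts, since a module installed for one interpreter may be absent from another.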


## How to download ##

To download BreadCrumbs from Bitbucket, use the command:

> hg clone https://bitbucket.org/timothyltickle/breadcrumbs

To update BreadCrumbs, run the following two commands in sequence from the BreadCrumbs directory:

> hg pull
> hg update


## Scripts: ##

Scripts are included to expose core functionality through the command line. Currently these scripts center on manipulating and visualizing abundance tables.
Brief descriptions of the scripts follow:

* *Hclust.py* Flexible script to create a visualization of hierarchical clustering of abundance tables (or other matrices).

* *scriptBiplotTSV.R* Allows one to plot a TSV file as a biplot using nonmetric multidimensional scaling.

* *scriptPlotFeature.py* Allows one to plot a histogram, boxplot, or scatter plot of a bug or metadata in an abundance table. Will work on any row in a matrix.

* *scriptManipulateTable.py* Allows one to perform common functions on an abundance table, including summing, normalizing, filtering, and stratifying.

* *scriptPcoa.py* Allows one to plot a principal coordinates analysis (PCoA) plot of an abundance table.

* *scriptConvertBetweenBIOMAndPCL.py* Allows one to convert between BIOM and PCL file formats.
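
To make the normalizing and filtering operations mentioned above concrete, here is a small illustrative sketch in plain Python. It is not the code used by *scriptManipulateTable.py*: the function names, the table layout (rows are features, columns are samples), and the thresholds are all assumptions chosen for the example.

```python
def normalize_columns(table):
    """Scale each sample (column) so that its values sum to 1.

    `table` is a list of rows (features); each row holds one count per
    sample. A column that sums to zero is left as all zeros.
    """
    column_sums = [sum(column) for column in zip(*table)]
    return [
        [value / float(total) if total else 0.0
         for value, total in zip(row, column_sums)]
        for row in table
    ]


def filter_features(table, feature_ids, min_abundance, min_samples):
    """Keep features that reach `min_abundance` in at least `min_samples` samples."""
    kept_ids, kept_rows = [], []
    for feature_id, row in zip(feature_ids, table):
        hits = sum(1 for value in row if value >= min_abundance)
        if hits >= min_samples:
            kept_ids.append(feature_id)
            kept_rows.append(row)
    return kept_ids, kept_rows


# Toy table: 3 features (rows) by 2 samples (columns); values are made up.
feature_ids = ["bug_a", "bug_b", "bug_c"]
counts = [[10, 0], [30, 5], [60, 15]]

relative = normalize_columns(counts)
ids, rows = filter_features(relative, feature_ids,
                            min_abundance=0.2, min_samples=2)
```

In this toy case the filter keeps `bug_b` and `bug_c`, because only those features reach a relative abundance of 0.2 in both samples.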


## Programming Classes: ##

Brief descriptions of classes are as follows. More detailed descriptions are given in the classes themselves.

* *AbundanceTable* Data structure to contain and perform operations on an abundance table.

* *BoxPlot* Wrapper to plot box plots.

* *CClade* Helper object used in hierarchical summing and normalization.

* *Cladogram* Object that manipulates an early dendrogram visualization. Deprecated; use the GraPhlAn visualization tool on Bitbucket instead.

* *CommandLine* Collection of code for working with the command line. Deprecated; use sfle calls instead.

* *ConstantsBreadCrumbs* Contains generic constants.

* *ConstantsFiguresBreadCrumbs* Contains constants associated with formatting figures.

* *KMedoids* Code from MLPY which performs KMedoids sample selection.

* *MLPYDistanceAdaptor* Used to allow custom distance matrices to be used by KMedoids.

* *Metric* Difference functions associated with distance and diversity metrics.

* *PCoA* Functionality surrounding the plotting of a PCoA.

* *PlotMatrix* Allows one to plot a matrix of numbers.

* *SVM* Support Vector Machine associated scripts.

* *Utility* Generic functions.

* *UtilityMath* Generic math-related functions.

* *ValidateData* Collection of functions to validate data types when needed.
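
As an illustration of the kind of distance and diversity metrics the *Metric* entry refers to, the sketch below computes Bray-Curtis dissimilarity and the Shannon index in plain Python. These are standard textbook definitions, not the *Metric* class API: the function names and the sample vectors are hypothetical, and which metrics BreadCrumbs actually implements should be checked in the class itself.

```python
import math


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    denominator = sum(sample_a) + sum(sample_b)
    # Two empty samples are treated as identical (an assumption of this sketch).
    return numerator / float(denominator) if denominator else 0.0


def shannon_index(sample):
    """Shannon diversity (natural log) of one abundance vector."""
    total = float(sum(sample))
    if total == 0:
        return 0.0
    proportions = [value / total for value in sample if value > 0]
    return -sum(p * math.log(p) for p in proportions)


def distance_matrix(samples, metric):
    """Symmetric matrix of pairwise distances between samples."""
    size = len(samples)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            matrix[i][j] = matrix[j][i] = metric(samples[i], samples[j])
    return matrix


# Three made-up samples, each with abundances for the same three features.
samples = [[10, 0, 5], [5, 5, 5], [0, 10, 5]]
distances = distance_matrix(samples, bray_curtis)
```

A pairwise matrix like `distances` is the usual input to ordination or clustering steps, which is why a distance function is passed in as a parameter rather than hard-coded.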


## Demo input files: ##

* *fastunifrac_Ley_et_al_NRM_2_sample_id_map.txt* Example UniFrac ID mapping file (source http://bmf2.colorado.edu/fastunifrac/tutorial.psp)

* *GreenGenesCore-May09.ref.tre* Example Greengenes core set reference for UniFrac demo (source http://bmf2.colorado.edu/fastunifrac/tutorial.psp)

* *Test.pcl* Example file / Test PCL file to run scripts on.

* *Test.biom* Example file / Test BIOM file to run scripts on.

* *Test_no_metadata.pcl* Example file / Test PCL file to run scripts on which does not have metadata.

* *Test_no_metadata.biom* Example file / Test BIOM file to run scripts on which does not have metadata.

* *Test-biplot.tsv* Example file / Test file for scriptBiplotTSV.R.


## Contributing Authors: ##
Timothy Tickle, George Weingart, Nicola Segata, Curtis Huttenhower


## Contact: ##
Please feel free to contact ttickle@hsph.harvard.edu with questions.