comparison naivestates.xml @ 0:1fb6181c2c64 draft

"planemo upload for repository https://github.com/ohsu-comp-bio/naivestates commit 392f57d212a7499bf1d3e421112a32a56635bc67-dirty"
author perssond
date Fri, 12 Mar 2021 00:20:13 +0000
parents
children a62b0c62270e
-1:000000000000 0:1fb6181c2c64
1 <tool id="naivestates" name="naivestates" version="@VERSION@.2" profile="17.09">
2 <description>Inference of cell states using Naive Bayes</description>
3 <macros>
4 <import>macros.xml</import>
5 </macros>
6
7 <expand macro="requirements"/>
8 @VERSION_CMD@
9
10 <command detect_errors="exit_code"><![CDATA[
11
12 @CMD_BEGIN@
13 -i '$counts'
14
15 #if $markers
16 -m '$markers'
17 #end if
18
19 --mct '$mct'
20 -p $plots
21
22 #if $id
23 --id '$id'
24 #end if
25
26 --log $log
27
28 #if $sfx
29 --sfx '$sfx'
30 #end if
31
32 #if $umap
33 --umap
34 #end if
35 -o .
36
37 &&
38
39 mv *-states.csv states.csv
40
41 #if $plots != "off"
42 && mv plots/*-probs.${plots} plots/probs.${plots}
43 && mv plots/*-summary.${plots} plots/summary.${plots}
44 && mv plots/*-allfits.${plots} plots/allfits.${plots}
45 #end if
46
47 ]]></command>
48
49
50 <inputs>
51 <param name="counts" type="data" format="csv" label="Quantified Cell Matrix"/>
52 <param name="markers" type="data" format="txt" optional="true" label="Markers to model"/>
53 <param name="mct" type="data" format="csv" label="Marker-State Association Map"/>
54 <param name="plots" type="select" label="Generate plots showing the fit">
55 <option selected="true" value="png">png</option>
56 <option value="pdf">pdf</option>
57 <option value="off">off</option>
58 </param>
59 <param name="id" type="text" value="" label="Column name containing cell IDs"/>
60 <param name="log" type="select" label="Log Transform" help="Whether to apply a log transform">
61 <option selected="true" value="auto">auto</option>
62 <option value="yes">yes</option>
63 <option value="no">no</option>
64 </param>
65 <param name="sfx" type="text" value="_cellMask" optional="true" label="Common suffix" help="Common suffix on marker columns (e.g., _cellMask)"/>
66 <param name="umap" type="boolean" checked="true" label="Generate UMAP plots"/>
67 </inputs>
68
69 <outputs>
70 <data format="csv" name="states" from_work_dir="states.csv" label="${tool.name} on ${on_string}: States CSV"/>
71 <data format="png" name="probs-png" from_work_dir="plots/probs.png" label="${tool.name} on ${on_string}: Probabilities">
72 <filter>plots == 'png'</filter>
73 </data>
74 <data format="png" name="summary-png" from_work_dir="plots/summary.png" label="${tool.name} on ${on_string}: Summary">
75 <filter>plots == 'png'</filter>
76 </data>
77 <data format="png" name="allfits-png" from_work_dir="plots/allfits.png" label="${tool.name} on ${on_string}: AllFits">
78 <filter>plots == 'png'</filter>
79 </data>
80 <data format="pdf" name="probs-pdf" from_work_dir="plots/probs.pdf" label="${tool.name} on ${on_string}: Probabilities">
81 <filter>plots == 'pdf'</filter>
82 </data>
83 <data format="pdf" name="summary-pdf" from_work_dir="plots/summary.pdf" label="${tool.name} on ${on_string}: Summary">
84 <filter>plots == 'pdf'</filter>
85 </data>
86 <data format="pdf" name="allfits-pdf" from_work_dir="plots/allfits.pdf" label="${tool.name} on ${on_string}: AllFits">
87 <filter>plots == 'pdf'</filter>
88 </data>
89 </outputs>
90 <help><![CDATA[
91 naivestates - Inference of cell states using Naive Bayes
92 This work is supported by the NIH Grant 1U54CA225088: Systems Pharmacology of Therapeutic and Adverse Responses to Immune Checkpoint and Small Molecule Drugs and by the NCI grant 1U2CCA233262: Pre-cancer atlases of cutaneous and hematologic origin (PATCH Center).
93
94 Introduction
95 naivestates is a label-free, cluster-free tool for inferring cell types from quantified marker expression data, based on known marker <-> cell type associations. The tool is designed to be run as a Docker container, but can also be installed in a Conda environment or as an R package. naivestates expects as input information about marker expression on a per-cell basis, provided in .csv format. One of the columns must contain cell IDs. An example input file may look as follows:
96
97 CellID,KERATIN,FOXP3,SMA
98 1,64.18060200668896,193.00334448160535,303.5016722408027
99 2,54.850202429149796,151.19433198380565,176.3846153846154
100 3,63.94712643678161,210.43218390804597,483.9448275862069
101 4,142.01320132013203,227.85808580858085,420.76897689768975
102 5,56.66379310344828,197.01896551724138,343.7810344827586
103 6,69.97454545454545,187.59636363636363,267.9709090909091
104 7,67.57754010695187,185.63368983957218,351.7914438502674
105 8,64.012,190.02,349.348
106 9,56.9622641509434,159.79245283018867,236.43867924528303
107 ...
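Before running the tool, it can help to confirm that the header row really contains the cell ID column. The following is a minimal shell sketch and not part of naivestates itself: the file name cells.csv and the helper has_column are hypothetical, and it assumes a plain comma-delimited header with no quoted fields.

```shell
#!/bin/sh
# Hypothetical sample file, reusing the first rows of the example above.
cat > cells.csv <<'EOF'
CellID,KERATIN,FOXP3,SMA
1,64.18060200668896,193.00334448160535,303.5016722408027
2,54.850202429149796,151.19433198380565,176.3846153846154
EOF

# has_column FILE NAME: succeed if NAME is one of the fields in the header row.
# Assumes a simple comma-delimited header without quoted fields.
has_column() {
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n' | grep -Fxq -- "$2"
}

if has_column cells.csv CellID; then
    echo "ID column found"
else
    echo "ID column missing; pass its name with --id" >&2
fi
```

If the IDs live in a differently named column, the same check can be repeated with that name before passing it to --id.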
108 Installation
109 Download the container image
110 Pull the latest version with
111
112 docker pull labsyspharm/naivestates
113 Alternatively, you can pull a specific version, which is recommended to ensure reproducibility of your analyses. For example, v1.2.0 can be pulled with
114
115 docker pull labsyspharm/naivestates:1.2.0
116 Examine the tool usage instructions
117 docker run --rm labsyspharm/naivestates:1.2.0 /app/main.R -h
118 replacing 1.2.0 with the version you are working with. Omit :1.2.0 entirely if you pulled the latest version above. The flag --rm tells Docker to delete the container instance after it finishes displaying the help message.
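To keep the pinned version in one place, the image reference can be built from a variable. This is only a sketch: the variable names NS_VERSION and NS_IMAGE are hypothetical, and the commands are printed rather than executed so that the snippet works on a machine without Docker.

```shell
#!/bin/sh
# Hypothetical variable names; 1.2.0 is the version used in the examples above.
NS_VERSION="1.2.0"
NS_IMAGE="labsyspharm/naivestates:${NS_VERSION}"

# Print the commands instead of running them, so no Docker daemon is needed.
echo "docker pull ${NS_IMAGE}"
echo "docker run --rm ${NS_IMAGE} /app/main.R -h"
```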
119
120 Basic usage
121 At minimum, the tool requires an input file and the list of marker names:
122
123 docker run --rm -v /path/to/data/folder:/data labsyspharm/naivestates:1.2.0 \
124 /app/main.R -i /data/myfile.csv -m aSMA,CD45,panCK
125 where we can make a distinction between Docker-level arguments:
126
127 --rm once again cleans up the container instance after it finishes running the code
128 -v /path/to/data/folder:/data maps the local folder containing your data to /data inside the container
129 :1.2.0 specifies the container version that we pulled above
130 and tool-level arguments:
131
132 -i /data/myfile.csv specifies which data file to process
133 -m aSMA,CD45,panCK specifies the markers of interest (NOTE: comma-delimited, no spaces)
134 If there is a large number of markers, place their names in a standalone file markers.txt with one marker per line. Ensure that the file lives in /path/to/data/folder/ and modify the Docker call to use the new file:
135
136 docker run --rm -v /path/to/data/folder:/data labsyspharm/naivestates:1.2.0 \
137 /app/main.R -i /data/myfile.csv -m /data/markers.txt
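One way to produce such a file from the shell is sketched below. The marker names are the ones used in the example above; writing to the current directory is an assumption, and the file would still need to end up in /path/to/data/folder.

```shell
#!/bin/sh
# Write one marker name per line; printf repeats the format for each argument.
printf '%s\n' aSMA CD45 panCK > markers.txt

# Sanity check: report how many markers were written.
count=$(wc -l < markers.txt | tr -d ' ')
echo "wrote $count markers to markers.txt"
```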
138 Additional parameters
139 The following parameters are optional, but may be useful in certain scenarios:
140
141 --plots <off|pdf|png> - (default: off) Produces QC plots of individual marker fits and summary UMAP plots in .png or .pdf format.
142 --id - (default: CellID) Name of the column that contains cell IDs
143 --log <yes|no|auto> - (default: auto) Whether a log10 transformation should be applied prior to fitting the data. With auto, the tool applies the transformation if it detects large values. Use --log no to force the use of original, non-transformed values instead.
144 -o - (default: /data) Alternative output directory. (Note that any file written to a directory that wasn't mapped with docker -v will not persist when the container is destroyed.)
145 --mct - The tool has a basic marker -> cell type (mct) mapping in typemap.csv. More sophisticated mct mappings can be defined by creating a custom-map.csv file with two columns: Marker and State. Ensure that custom-map.csv is in /path/to/data/folder and point the tool at it with --mct (e.g., /app/main.R -i /data/myfile.csv --mct /data/custom-map.csv -m aSMA,CD45,panCK)
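A custom map in the two-column layout described for --mct could be created as sketched below. The marker-to-state pairs are illustrative placeholders and are not taken from the tool's built-in typemap.csv; only the column names Marker and State come from the description above.

```shell
#!/bin/sh
# Illustrative only: the marker-to-state pairs below are placeholders,
# not the contents of the tool's built-in typemap.csv.
cat > custom-map.csv <<'EOF'
Marker,State
aSMA,Stroma
CD45,Immune
panCK,Tumor
EOF

# Count data rows that do not have exactly two comma-separated fields.
bad_rows=$(awk -F',' 'NR > 1 && NF != 2' custom-map.csv | wc -l | tr -d ' ')
echo "rows with wrong field count: $bad_rows"
```

A non-zero count would point at a stray comma or a missing state before the file is handed to --mct.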
146 Alternative execution environments
147 Running in a Conda environment
148 If you are working in a computational environment that doesn't support Docker, the repository provides a Conda-based alternative. Ensure that conda is installed on your system, then 1) clone this repository, 2) instantiate the conda environment and 3) install the tool.
149
150 git clone https://github.com/labsyspharm/naivestates.git
151 cd naivestates
152 conda env create -f conda.yml
153 conda activate naivestates
154 R -s -e "devtools::install_github('labsyspharm/naivestates')"
155 The tool can now be used as above by running main.R:
156
157 ./main.R -h
158 ./main.R -i /path/to/datafile.csv -m aSMA,CD45,panCK
159 Running as an R package
160 The tool can also be installed as an R package directly from GitHub:
161
162 if( !require(devtools) ) install.packages("devtools")
163 devtools::install_github( "labsyspharm/naivestates" )
164 Example usage:
165
166 library( tidyverse )
167 library( naivestates )
168
169 # Load the original data
170 X <- read_csv( "datafile.csv" )
171
172 # Fit models to channels aSMA, CD45 and panCK
173 # Specify that cell IDs are in column CellID
174 GMM <- GMMfit( X, CellID, aSMA, CD45, panCK )
175
176 # Plot a fit to one of the markers
177 plotFit( GMM, "CD45" )
178
179 # Write out the results to results.csv
180 GMMreshape(GMM) %>% write_csv( "results.csv" )
181
182 OHSU Wrapper Repo: https://github.com/ohsu-comp-bio/naivestates
183 ]]></help>
184 <expand macro="citations" />
185 </tool>