annotate tools/mytools/align2database.xml @ 1:cdcb0ce84a1b

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:45:15 -0500
parents 9071e359b9a3
children
rev   line source
0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
<tool id="align2database" name="align-to-database">
  <description>features</description>
  <command interpreter="python">align2database.py $query $database $output_coverage $output_standarderror $output_plot $minfeat $windowsize $anchor $span > $outlog</command>
  <inputs>
    <param name="query" type="data" format="interval" label="Query intervals" help="Keep it small (fewer than 1,000,000 lines)"/>
    <param name="database" type="select" label="Feature database">
      <option value="/Users/xuebing/galaxy-dist/tool-data/aligndb/mm9/feature_database" selected="true">All mm9 features (over 200)</option>
      <option value="/Users/xuebing/galaxy-dist/tool-data/aligndb/mm9/annotation">Annotated mm9 features</option>
      <option value="/Users/xuebing/galaxy-dist/tool-data/aligndb/mm9/CLIP">protein bound RNA (CLIP) mm9</option>
      <option value="/Users/xuebing/galaxy-dist/tool-data/aligndb/mm9/conservedmiRNAseedsite">conserved miRNA target sites mm9</option>
      <option value="/Users/xuebing/galaxy-dist/tool-data/aligndb/hg18/all-feature">Human ChIP hmChIP database hg18</option>
      <option value="/Users/xuebing/galaxy-dist/tool-data/aligndb/hg18/gene-feature">Human gene features hg18</option>
      <option value="/Users/xuebing/galaxy-dist/tool-data/aligndb/hg19/conservedmiRNAseedsite">conserved miRNA target sites hg19</option>
    </param>
    <param name="anchor" label="Anchor to query features" help="By default, anchoring is to database features" type="boolean" truevalue="query" falsevalue="database" checked="False"/>
    <param name="windowsize" size="10" type="integer" value="5000" label="Window size (-w)" help="Creates new intervals of w bp flanking the original center. Set to 0 to keep the input interval size unchanged."/>
    <param name="minfeat" size="10" type="integer" value="100" label="Minimum number of query interval hits" help="Database features overlapping too few query intervals are discarded"/>
    <param name="span" size="10" type="float" value="0.1" label="loess span: smoothing parameter" help="A value less than 0.1 disables smoothing"/>
    <param name="outputlabel" size="80" type="text" label="Output label" value="test"/>

  </inputs>
  <outputs>
    <data format="txt" name="outlog" label="${outputlabel} (log)"/>
    <data format="tabular" name="output_standarderror" label="${outputlabel} (standard error)"/>
    <data format="tabular" name="output_coverage" label="${outputlabel} (coverage)"/>
    <data format="pdf" name="output_plot" label="${outputlabel} (plot)"/>
  </outputs>
  <help>

**Example output**

.. image:: ./static/operation_icons/align_multiple2.png


**What it does**

This tool aligns a query interval set (such as ChIP peaks) to a database of features (such as other ChIP peaks or TSS/splice sites), then calculates and plots the distance of the database features relative to the query intervals. Currently two databases are available:

- **ChIP peaks** from 191 ChIP experiments (processed from the hmChIP database; see the individual peak/BED files in **Shared Data**)

- **Annotated gene features**, such as TSS, TES, 5'ss, 3'ss, CDS start and end, miRNA seed matches, enhancers, CpG islands, microsatellites, small RNAs, poly(A) sites (3P-seq tags), miRNA genes, and tRNA genes.

In addition to the log and standard-error outputs, two main output files are generated. The first is a tabular file with the coverage profile of each database feature that meets the minimum overlap with the query set. Its first two columns are the feature name and the total number of overlapping query intervals; columns 3 to 102 give the coverage in each of the 100 bins. The second is a PDF file that plots a heatmap of all features and the average coverage for each individual database feature.


**How it works**

For each interval/peak in the query file, a window (default 10,000 bp) is created around the center of the interval and divided into 100 bins. For each database feature set (such as Pol II peaks), the tool counts how many intervals in the database feature file overlap each bin. The count is then averaged over all query intervals that have at least one hit in at least one bin. Overall, the plotted 'average coverage' represents the fraction of query features (only those with hits; the number is shown in each plot title) that have a database feature interval covering that bin. In the extreme case where the database feature set is the same as the query, every query interval is covered at its center, so the average coverage of the center bin is 1.

Each row of the heatmap is scaled before clustering.

  </help>
</tool>
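The coverage table layout described in the help text (feature name, hit count, then 100 bin values) can be read with a few lines of Python. This is a minimal sketch and not part of the tool: `CoverageRow` and `read_coverage` are hypothetical names, and the parser assumes a tab-separated file with no header line, which the source does not state explicitly.

```python
import csv
import io
from typing import IO, List, NamedTuple


class CoverageRow(NamedTuple):
    feature: str        # column 1: database feature name
    n_query_hits: int   # column 2: number of overlapping query intervals
    bins: List[float]   # columns 3 to 102: coverage in each bin


def read_coverage(handle: IO[str], n_bins: int = 100) -> List[CoverageRow]:
    """Parse tab-separated coverage rows (assumption: no header line)."""
    rows: List[CoverageRow] = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2 + n_bins:
            raise ValueError(f"expected {2 + n_bins} columns, got {len(fields)}")
        # int(float(...)) also accepts counts written as "250.0"
        hits = int(float(fields[1]))
        rows.append(CoverageRow(fields[0], hits, [float(x) for x in fields[2:]]))
    return rows


# Usage with a made-up two-feature table (values are illustrative only).
demo = "\n".join(
    "\t".join([name, str(hits)] + ["0.5"] * 100)
    for name, hits in [("PolII", 250), ("TSS", 120)]
)
table = read_coverage(io.StringIO(demo))
print(table[0].feature, table[0].n_query_hits, len(table[0].bins))
```

Rejecting rows with an unexpected column count makes a mismatch between the assumed layout and a real file show up immediately instead of silently shifting bins.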
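The binning and averaging described under "How it works" can be sketched as follows. This is an illustration under stated assumptions, not the code in `align2database.py`: the function name `average_coverage`, the `(chrom, start, end)` tuple layout, half-open coordinates, and the naive nested loop are all choices made for this example. The 10,000 bp window and 100 bins are taken from the help text.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end); half-open coordinates assumed


def average_coverage(
    query: List[Interval],
    features: List[Interval],
    window: int = 10000,
    n_bins: int = 100,
) -> Tuple[List[float], int]:
    """Return (per-bin average coverage, number of query intervals with hits)."""
    bin_size = window / n_bins
    totals = [0.0] * n_bins
    n_with_hits = 0
    for chrom, q_start, q_end in query:
        # Window of `window` bp centered on the query interval.
        left = (q_start + q_end) // 2 - window // 2
        counts = [0] * n_bins
        for f_chrom, f_start, f_end in features:
            if f_chrom != chrom:
                continue
            # Clip the feature to the window; skip it if nothing remains.
            lo = max(f_start, left)
            hi = min(f_end, left + window)
            if lo >= hi:
                continue
            first = int((lo - left) // bin_size)
            last = int((hi - 1 - left) // bin_size)
            for b in range(first, last + 1):
                counts[b] += 1
        # Only query intervals with at least one hit enter the average.
        if any(counts):
            n_with_hits += 1
            totals = [t + c for t, c in zip(totals, counts)]
    if n_with_hits == 0:
        return totals, 0
    return [t / n_with_hits for t in totals], n_with_hits


# Extreme case from the help text: the database is the query itself,
# so every query interval is covered at its own center.
peaks = [("chr1", 1000, 2000), ("chr1", 50000, 50400)]
profile, n_hits = average_coverage(peaks, peaks)
print(n_hits, profile[50])  # 2 1.0
```

The nested loop is quadratic in the number of intervals, so it only suits small inputs; a real implementation would more likely sort the intervals or use an interval index.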