comparison rank_pathways.xml @ 14:8ae67e9fb6ff

Uploaded Miller Lab Devshed version a51c894f5bed again [possible toolshed.g2 bug]
author miller-lab
date Fri, 28 Sep 2012 11:35:56 -0400
parents
children d6b961721037
13:fdb4240fb565 14:8ae67e9fb6ff
1 <tool id="gd_calc_freq" name="Rank Pathways" version="1.0.0">
2 <description>: Assess the impact of gene sets on pathways</description>
3
4 <command interpreter="python">
5 #if str($output_format) == 'a'
6 calctfreq.py
7 #else if str($output_format) == 'b'
8 calclenchange.py
9 #end if
10 "--loc_file=${GALAXY_DATA_INDEX_DIR}/gd.rank.loc"
11 "--species=${input.metadata.dbkey}"
12 "--input=${input}"
13 "--output=${output}"
14 "--posKEGGclmn=${input.metadata.kegg_path}"
15 "--KEGGgeneposcolmn=${input.metadata.kegg_gene}"
16 </command>
17
18 <inputs>
19 <param name="input" type="data" format="gd_sap" label="Table">
20 <validator type="metadata" check="kegg_gene,kegg_path" message="Missing KEGG gene code column and/or KEGG pathway code/name column metadata. Click the pencil icon in the history item to edit/save the metadata attributes" />
21 </param>
22 <param name="output_format" type="select" label="Output format">
23 <option value="a" selected="true">ranked by percentage of genes affected</option>
24 <option value="b">ranked by change in length and number of paths</option>
25 </param>
26 </inputs>
27
28 <outputs>
29 <data name="output" format="tabular" />
30 </outputs>
31
32 <tests>
33 <test>
34 <param name="input" value="test_in/sample.gd_sap" ftype="gd_sap" />
35 <param name="output_format" value="a" />
36 <output name="output" file="test_out/rank_pathways/rank_pathways.tabular" />
37 </test>
38 </tests>
39
40 <help>
41
42 **What it does**
43
44 This tool produces a table that ranks pathways by the percentage of
45 each pathway's genes that are present in the input dataset.
46 Alternatively, the tool ranks pathways by the change in the mean
47 length and in the number of paths connecting sources and sinks. This change is
48 calculated by comparing the graph of each pathway with and without
49 the nodes that represent the genes in the input dataset. Sources are all
50 the nodes representing the initial reactants/products in the pathway.
51 Sinks are all the nodes representing the final reactants/products in
52 the pathway.
53
54 If pathways are ranked by percentage of genes affected, the output is
55 a tabular dataset with the following columns:
56
57 1. number of genes in the pathway present in the input dataset
58 2. percentage of the total genes in the pathway included in the input dataset
59 3. rank of the percentage (from highest to lowest)
60 4. name of the pathway
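
The percentage ranking described above can be sketched in a few lines of Python. This is an illustration only, not the code in calctfreq.py: the function name, the dictionary layout, the sample gene and pathway identifiers, and the tie handling (tied pathways share a rank) are all assumptions made for the example.

```python
def rank_pathways_by_percentage(pathway_genes, input_genes):
    """Return rows of (count, percentage, rank, pathway name).

    pathway_genes: hypothetical dict mapping a pathway name to its gene ids.
    input_genes: gene ids found in the input dataset.
    """
    hits = set(input_genes)
    rows = []
    for pathway, genes in pathway_genes.items():
        members = set(genes)
        if not members:
            continue  # skip empty pathways to avoid dividing by zero
        count = len(members.intersection(hits))
        percentage = 100.0 * count / len(members)
        rows.append((count, percentage, pathway))

    # Highest percentage first; the pathway name breaks ties in the ordering.
    rows.sort(key=lambda row: (-row[1], row[2]))

    ranked = []
    rank = 0
    previous = None
    for position, (count, percentage, pathway) in enumerate(rows, start=1):
        if percentage != previous:
            # Assumption: equal percentages share the same rank.
            rank = position
            previous = percentage
        ranked.append((count, percentage, rank, pathway))
    return ranked


# Made-up sample data, for illustration only.
example_pathways = {
    "path:A": ["g1", "g2", "g3", "g4"],
    "path:B": ["g2", "g5"],
    "path:C": ["g6"],
}
example_rows = rank_pathways_by_percentage(example_pathways, ["g1", "g2", "g7"])
for row in example_rows:
    print("\t".join(str(field) for field in row))
```

Each printed row follows the column order listed above: count, percentage, rank, pathway name.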
61
62 If pathways are ranked by change in length and number of paths, the
63 output is a tabular dataset with the following columns:
64
65 1. change in the mean length of paths between sources and sinks
66 2. mean length of paths between sources and sinks in the pathway including the genes in the input dataset. If the pathway does not have sources/sinks, the length is assumed to be infinite (I)
67 3. mean length of paths between sources and sinks in the pathway excluding the genes in the input dataset. If the pathway does not have sources/sinks, the length is assumed to be infinite (I)
68 4. rank of the change in the mean length of paths between sources and sinks (from high change to low change)
69 5. change in the number of paths between sources and sinks
70 6. number of paths between sources and sinks in the pathway including the genes in the input dataset. If the pathway does not have sources/sinks, it is assumed to be a circuit (C)
71 7. number of paths between sources and sinks in the pathway excluding the genes in the input dataset. If the pathway does not have sources/sinks, it is assumed to be a circuit (C)
72 8. rank of the change in the number of paths between sources and sinks (from high change to low change)
73 9. name of the pathway
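
The path-based measures above can be illustrated with a small Python sketch. It is not the code in calclenchange.py. It assumes that a pathway is a directed graph stored as a dictionary of successor lists, that a source is a node with outgoing but no incoming edges, that a sink is a node with incoming but no outgoing edges, that sources and sinks are recomputed after the genes are removed, and that only simple paths (no repeated nodes) are counted. All function names and the sample graph are hypothetical.

```python
def find_sources_and_sinks(graph):
    """graph maps each node to the list of nodes it points to."""
    targets = {t for successors in graph.values() for t in successors}
    sources = [n for n in graph if graph[n] and n not in targets]
    sinks = [n for n in targets if not graph.get(n)]
    return sources, sinks


def path_lengths(graph, sources, sinks):
    """Edge counts of every simple path from any source to any sink."""
    sink_set = set(sinks)
    lengths = []

    def walk(node, visited, depth):
        if node in sink_set:
            lengths.append(depth)
            return
        for successor in graph.get(node, []):
            if successor not in visited:
                walk(successor, visited | {successor}, depth + 1)

    for source in sources:
        walk(source, {source}, 0)
    return lengths


def summarize_paths(graph):
    """Return (mean path length, number of paths), or ("I", "C")."""
    sources, sinks = find_sources_and_sinks(graph)
    if not sources or not sinks:
        return "I", "C"  # no sources/sinks: infinite length, circuit
    lengths = path_lengths(graph, sources, sinks)
    if not lengths:
        return 0.0, 0  # assumption: no connecting path is reported as zero
    return sum(lengths) / len(lengths), len(lengths)


def remove_nodes(graph, removed):
    """Copy of graph without the removed nodes and the edges touching them."""
    removed = set(removed)
    return {
        node: [s for s in successors if s not in removed]
        for node, successors in graph.items()
        if node not in removed
    }


# Made-up pathway: two routes from "a" to "e", one of them through "c" and "x".
pathway = {"a": ["b", "c"], "b": ["d"], "c": ["x"], "x": ["d"], "d": ["e"], "e": []}
including = summarize_paths(pathway)
excluding = summarize_paths(remove_nodes(pathway, ["c", "x"]))
print(including, excluding)
```

Subtracting the second summary from the first gives the two "change" columns in this sketch. The direction of that subtraction is itself an assumption, since the text above does not state it.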
74
75 </help>
76 </tool>