comparison tools/regVariation/categorize_elements_satisfying_criteria.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
parents
children
-1:000000000000 0:9071e359b9a3
1 <tool id="categorize_elements_satisfying_criteria" name="Categorize Elements" version="1.0.0">
2 <description>satisfying criteria</description>
3
4 <command interpreter="perl">
5 categorize_elements_satisfying_criteria.pl $inputFile1 $inputFile2 $outputFile1
6 </command>
7
8 <inputs>
9 <param format="tabular" name="inputFile1" type="data" label="Select file containing categories and their elements"/>
10 <param format="tabular" name="inputFile2" type="data" label="Select file containing criteria and elements data"/>
11 </inputs>
12
13 <outputs>
14 <data format="tabular" name="outputFile1"/>
15 </outputs>
16
17 <tests>
18 <test>
19 <param name="inputFile1" value="categories.tabular" ftype="tabular" />
20 <param name="inputFile2" value="criteria_elements_data.tabular" ftype="tabular" />
21 <output name="outputFile1" file="categorized_elements.tabular" />
22 </test>
23 </tests>
24
25
26 <help>
27
28 .. class:: infomark
29
30 **What it does**
31
32 The program takes as input a set of categories, each containing one or more elements. It also takes a table relating elements to criteria, in which each element is assigned, for every criterion, the number of times it satisfies that criterion.
33
34 - The first input is a TABULAR format file in which the left column holds the names of the categories and all other columns hold the names of the elements in each category.
35 - The second input is a TABULAR format file relating elements with criteria, in which the first line holds the names of the criteria and the left column holds the names of the elements.
36 - The output is a TABULAR format file relating categories with criteria, such that each category is assigned a number representing the total number of times its elements satisfy a certain criterion. Each category is assigned as many numbers as there are criteria.
37
38
39 **Example**
40
41 Let the first input file be a group of motif categories as follows::
42
43 Deletion_Hotspots deletionHoptspot1 deletionHoptspot2 deletionHoptspot3
44 Dna_Pol_Pause_Frameshift dnaPolPauseFrameshift1 dnaPolPauseFrameshift2 dnaPolPauseFrameshift3 dnaPolPauseFrameshift4
45 Indel_Hotspots indelHotspot1
46 Insertion_Hotspots insertionHotspot1 insertionHotspot2
47 Topoisomerase_Cleavage_Sites topoisomeraseCleavageSite1 topoisomeraseCleavageSite2 topoisomeraseCleavageSite3
48
49
50 And let the second input file represent the number of times each motif occurs in a certain window size of indel flanking regions, as follows::
51
52 10bp 20bp 40bp
53 deletionHoptspot1 1 1 2
54 deletionHoptspot2 1 1 1
55 deletionHoptspot3 0 0 0
56 dnaPolPauseFrameshift1 1 1 1
57 dnaPolPauseFrameshift2 0 2 1
58 dnaPolPauseFrameshift3 0 0 0
59 dnaPolPauseFrameshift4 0 1 2
60 indelHotspot1 0 0 0
61 insertionHotspot1 0 0 1
62 insertionHotspot2 1 1 1
63 topoisomeraseCleavageSite1 1 1 1
64 topoisomeraseCleavageSite2 1 2 1
65 topoisomeraseCleavageSite3 0 0 2
66
67 Running the program will give the total number of times the motifs of each category occur in every window size of indel flanking regions::
68
69 10bp 20bp 40bp
70 Deletion_Hotspots 2 2 3
71 Dna_Pol_Pause_Frameshift 1 4 4
72 Indel_Hotspots 0 0 0
73 Insertion_Hotspots 1 1 2
74 Topoisomerase_Cleavage_Sites 2 3 4
75
76 </help>
77
78 </tool>
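
The aggregation described in the help text above can be sketched in Python as follows. This is only an illustration of the summing step, not the Perl script the tool invokes (the contents of categorize_elements_satisfying_criteria.pl are not shown here). The function name `categorize` is hypothetical, and two behaviours are assumptions: an element missing from the second file contributes zero, and the header row may or may not start with an empty cell.

```python
import csv
import io


def categorize(categories_text, criteria_text):
    """Sum per-element counts into per-category totals.

    Hypothetical helper for illustration; not the tool's Perl implementation.
    Both arguments are the contents of tab-separated files as strings.
    """
    # Second input: a header row of criteria names, then one row per element.
    rows = [r for r in csv.reader(io.StringIO(criteria_text), delimiter="\t") if r]
    criteria = [name for name in rows[0] if name]  # tolerate an empty first cell
    counts = {r[0]: [int(v) for v in r[1:]] for r in rows[1:]}

    # First input: a category name followed by the names of its elements.
    totals = {}
    for row in csv.reader(io.StringIO(categories_text), delimiter="\t"):
        if not row:
            continue
        category = row[0]
        elements = [e for e in row[1:] if e]
        sums = [0] * len(criteria)
        for element in elements:
            # Assumption: an element absent from the second file adds nothing.
            values = counts.get(element, [])[: len(criteria)]
            for i, value in enumerate(values):
                sums[i] += value
        totals[category] = sums
    return criteria, totals


if __name__ == "__main__":
    # Two rows taken from the example in the help text.
    categories = "Insertion_Hotspots\tinsertionHotspot1\tinsertionHotspot2\n"
    criteria = (
        "\t10bp\t20bp\t40bp\n"
        "insertionHotspot1\t0\t0\t1\n"
        "insertionHotspot2\t1\t1\t1\n"
    )
    names, totals = categorize(categories, criteria)
    print("\t".join(names))
    for category, sums in totals.items():
        print(category + "\t" + "\t".join(str(s) for s in sums))
```

Run on the two Insertion_Hotspots rows from the example, the sketch yields totals of 1, 1 and 2 for the 10bp, 20bp and 40bp windows, matching the Insertion_Hotspots line of the example output.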