annotate microsatpurity.xml @ 7:3c05abb4452e default tip

add missing files
author devteam@galaxyproject.org
date Wed, 22 Apr 2015 12:22:50 -0400
parents b27006b0a953
children
rev   line source
5
b27006b0a953 update to latest version
devteam@galaxyproject.org
parents: 0
diff changeset
1 <tool id="microsatpurity" name="Select uninterrupted STRs" version="1.0.0">
0
20ab85af9505 Uploaded
arkarachai-fungtammasan
parents:
diff changeset
2 <description> of a specific column</description>
3 <command interpreter="python">microsatpurity.py $input $period $column_n > $output </command>
4
5 <inputs>
6 <param name="input" type="data" label="Select input" />
7 <param name="period" type="integer" label="motif size" value="1"/>
8 <param name="column_n" type="integer" value="0" label="Select column that contains microsatellites of interest (0 = last column)" />
9 </inputs>
10 <outputs>
11 <data format="tabular" name="output" />
12
13 </outputs>
14 <tests>
15 <!-- Test data with valid values -->
16 <test>
17 <param name="input" value="microsatpurity_in.txt"/>
18 <param name="period" value="2"/>
19 <param name="column_n" value="0"/>
20 <output name="output" file="microsatpurity_out.txt"/>
21 </test>
22
23 </tests>
24 <help>
25
26
27 .. class:: infomark
28
29 **What it does**
30
5
31 This tool selects only uninterrupted STRs/microsatellites. Interrupted STRs (e.g. ATATATATAATATAT) and STR sequences that include non-STR bases (e.g. ATATATATATG) are removed.
0
32
5
33 In the STR-FM pipeline (profiling STRs in short read data), this tool can also be used to exclude cases where flanking bases were misread as part of an STR (sequencing errors). The remaining read profile then reflects only the variation in STR length caused by expansion/contraction.
34 For example, suppose that the sequence around an STR in the reference genome is AGCGACGaaaaaaGCGATCA. If we observe a read with sequence AGCGACGaaaaaaaaaaGCGATCA, we can infer that this is an STR expansion. However, if we observe another read with sequence AGCGACGaaaaaaaCGATCA, this is more likely a substitution of G to A in the flanking region. Such cases can be removed with this tool.
35 You can use the tool **combine mapped flanking bases** to get the STRs in the reference that correspond to the sequence between mapped reads. If the user maps these reads around uninterrupted STRs in the reference, the corresponding sequence between each pair should be an uninterrupted STR regardless of expansion/contraction of the STR in the short read data. However, if a flanking base is substituted, or if the fluorescent signal from the previous run makes it look like a substitution, the corresponding reference sequence between the pair will not be an uninterrupted STR. This tool removes those cases and keeps only STR expansion/contraction.
0
36
37
38 **Citation**
39
5
40 When you use this tool, please cite **Fungtammasan A, Ananda G, Hile SE, Su MS, Sun C, Harris R, Medvedev P, Eckert K, Makova KD. 2015. Accurate Typing of Short Tandem Repeats from Genome-wide Sequencing Data and its Applications, Genome Research**
0
41
42 **Input**
43
44 The input can be any tab-delimited file.
45
5
46 If this tool is used in STR-FM for STR profiling, the input should contain:
0
47
5
48 - Column 1 = STR location in reference chromosome
49 - Column 2 = STR location in reference start
50 - Column 3 = STR location in reference stop
51 - Column 4 = STR location in reference motif
52 - Column 5 = STR location in reference length
53 - Column 6 = STR location in reference motif size
54 - Column 7 = length of STR (bp)
55 - Column 8 = length of left flanking region (bp)
56 - Column 9 = length of right flanking region (bp)
0
57 - Column 10 = repeat motif (bp)
58 - Column 11 = hamming distance
59 - Column 12 = read name
5
60 - Column 13 = read sequence with soft masking of STR
0
61 - Column 14 = read quality (the same Phred score scale as input)
62 - Column 15 = read name (The same as column 12)
63 - Column 16 = chromosome
64 - Column 17 = left flanking region start
65 - Column 18 = left flanking region stop
5
66 - Column 19 = STR start as inferred from paired-end reads
67 - Column 20 = STR stop as inferred from paired-end reads
0
68 - Column 21 = right flanking region start
69 - Column 22 = right flanking region stop
5
70 - Column 23 = STR length in reference
71 - Column 24 = STR sequence in reference
0
72
73 **Output**
74
75 The same as input format.
76
77
78 </help>
5
79 </tool>
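
The wrapper above calls microsatpurity.py with an input file, a motif size ($period) and a column number ($column_n, where 0 means the last column). The script itself is not shown on this page, so the following is only a minimal sketch of how such a filter could work, not the actual implementation. The function names is_uninterrupted and filter_lines are hypothetical. The sketch assumes a sequence counts as uninterrupted when every base equals the base one motif length before it, which also accepts a partial final motif (e.g. ATATA) and a homopolymer checked with a larger period.

```python
def is_uninterrupted(seq, period):
    """Return True if seq is a pure repeat of its first `period` bases.

    Hypothetical helper: an assumption about the purity check,
    not code taken from microsatpurity.py.
    """
    seq = seq.upper()  # treat soft-masked (lowercase) bases like uppercase
    if period <= 0 or len(seq) < period:
        return False
    # Every base must match the base exactly one motif length earlier.
    return all(seq[i] == seq[i - period] for i in range(period, len(seq)))


def filter_lines(lines, period, column_n):
    """Yield tab-delimited lines whose chosen column is an uninterrupted STR.

    column_n is 1-based; 0 selects the last column, so the Python
    index column_n - 1 becomes -1 in that case.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if is_uninterrupted(fields[column_n - 1], period):
            yield line
```

With the two examples from the help text, is_uninterrupted("ATATATATAATATAT", 2) and is_uninterrupted("ATATATATATG", 2) are both rejected, while a clean repeat such as "ATATATAT" is kept.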