comparison check_samcomp_for_lost_reads.xml @ 0:e979cb57a5d5 draft default tip

"planemo upload for repository https://github.com/McIntyre-Lab/BayesASE/tree/main/galaxy commit 9b70598ef46a73632d9e0fa0c6ce6776fb5e9d6a"
author malex
date Thu, 14 Jan 2021 21:51:36 +0000
-1:000000000000 0:e979cb57a5d5
1 <tool id="check_samcomp_for_lost_reads" name="Check SAM Compare Output" version="21.1.13">
2 <description> - check read numbers in and out of Compare SAM Files and Create ASE Counts Tables tool</description>
3 <macros>
4 <import>macros.xml</import>
5 </macros>
6 <expand macro="requirements" />
7 <command><![CDATA[
8 check_samcomp_lost_reads.py
9 --summary1='$sum1'
10 --summary2='$sum2'
11 --ase_names='$ase.element_identifier'
12 --ase='$ase'
13 --out='$out'
14 ]]></command>
15 <inputs>
16 <param name="sum1" type="data" format="tabular" label="Remove Reads Summary file for updated genome 1 (G1)" help="Select the summary file containing the read counts after dropping non-overlapping reads [Required]"/>
17 <param name="sum2" type="data" format="tabular" label="Remove Reads Summary file for updated genome 2 (G2)" help="Select the summary file containing the read counts after dropping non-overlapping reads [Required]"/>
18 <param name="ase" type="data" format="tsv" label="ASE Totals Table" help="Select the ASE Totals table containing the read counts generated by the Compare SAM Files and Create ASE Counts Tables tool [Required]"/>
19 </inputs>
20 <outputs>
21 <data name="out" format="tabular" label="${tool.name} on ${on_string}: Check SAM Compare Output"/>
22 </outputs>
23 <tests>
24 <test>
25 <param name="sum1" ftype="data" value="align_and_counts_test_data/number_rows_left_after_removed_reads.tabular"/>
26 <param name="sum2" ftype="data" value="align_and_counts_test_data/number_rows_left_after_removed_reads_G2.tabular"/>
27 <param name="ase" ftype="data" value="align_and_counts_test_data/ASE_totals_table_BASE_test_data.tsv" />
28 <output name="out" file="align_and_counts_test_data/check_SAM_compare_for_lost_reads_BASE_test_data.tabular" />
29 </test>
30 </tests>
31 <help><![CDATA[
32 **Tool Description**
33
34 The Check SAM Compare Output tool checks that the number of reads going into the Compare SAM Files and Create ASE Counts Tables tool matches the number of reads coming out of it.
35 The total of all reads mapped to each feature should be the sum of all unique reads mapped to that feature from the two initial alignment files.
36 This implies that the total must be at least the number of reads mapping to one genome and no more than the sum of reads mapping to both genomes.
37 Numbers of reads in the ASE Totals file outside this range indicate that the Compare SAM Files and Create ASE Counts Tables tool should be rerun.
38
39 **This tool takes the following inputs:**
40
41 (1) Remove Reads Summary for G1 - the summary file generated from the Remove Reads tool containing the number of rows left after non-overlapping reads were removed for G1
42 (2) Remove Reads Summary for G2 - the summary file generated from the Remove Reads tool containing the number of rows left after non-overlapping reads were removed for G2
43 (3) ASE Totals Table - contains the read counts generated by the Compare SAM Files and Create ASE Counts Tables tool
44
45 An example Remove Reads summary file:
46
47 +---------------+---------------------------+---------------------+
48 | fqNa          | number_overlapping_rows   | total_number_rows   |
49 +===============+===========================+=====================+
50 | dataset_2215  | 918                       | 919                 |
51 +---------------+---------------------------+---------------------+
52
53
54 An example of an ASE Totals file::
55
56 Count totals:
57 1: a_single_exact 0
58 2: a_single_inexact 0
59 3: a_multi_exact 0
60 4: a_multi_inexact 0
61 5: b_single_exact 0
62 6: b_single_inexact 0
63 7: b_multi_exact 0
64 8: b_multi_inexact 0
65 9: both_single_exact_same 0
66 10: both_single_exact_diff 6
67 11: both_single_inexact_same 0
68 12: both_single_inexact_diff 8
69 13: both_inexact_diff_equal 5
70 14: both_inexact_diff_a_better 1
71 15: both_inexact_diff_b_better 2
72 16: both_multi_exact 0
73 17: both_multi_inexact 0
74 18: a_single_exact_b_single_inexact 0
75 19: a_single_inexact_b_single_exact 0
76 20: a_single_exact_b_multi_exact 0
77 21: a_multi_exact_b_single_exact 0
78 22: a_single_exact_b_multi_inexact 0
79 23: a_multi_inexact_b_single_exact 0
80 24: a_single_inexact_b_multi_exact 0
81 25: a_multi_exact_b_single_inexact 0
82 26: a_single_inexact_b_multi_inexact 0
83 27: a_multi_inexact_b_single_inexact 0
84 28: a_multi_exact_b_multi_inexact 0
85 29: a_multi_inexact_b_multi_exact 0
86 30: total_count 14
87
88 **This tool will output a tabular file containing the following columns:**
89
90 (1) fqName
91 (2) min_uniq_g1_uniq_g2: The minimum number of unique reads of the two BWA files
92 (3) sum_uniq_g1_uniq_g2: The sum of the unique reads in the two BWA files
93 (4) total_counts_ase_table: The final total count in the ASE totals file (should be between (2) and (3) doubled for the check to be successful)
94 (5) flag_readnum_in_range: A 0/1 indicator flag that is equal to 1 if the check was successful or 0 if the check was unsuccessful
95
96 An example of an unsuccessful output file where reads were lost:
97
98 +---------------+---------------------+--------------------------------------+----------------------+----------------------+
99 | fqName        | min_uniq_g1_uniq_g2 | sum_uniq_g1_uniq_g2                  |total_counts_ase_table| flag_readnum_in_range|
100 +===============+=====================+======================================+======================+======================+
101 | name_of_fq    | 14                  | 28                                   | 8                    | 0                    |
102 +---------------+---------------------+--------------------------------------+----------------------+----------------------+
103
104 ]]></help>
105
106 <citations>
107 <citation type="bibtex">@ARTICLE{Miller20BASE,
108 author = {Brecca Miller and Alison M. Morse and Elyse Borgert and Zihao Liu and Kelsey Sinclair and Gavin Gamble and Fei Zou and Jeremy Newman and Luis Leon Novello and Fabio Marroni and Lauren M. McIntyre},
109 title = {Testcrosses are an efficient strategy for identifying cis regulatory variation: Bayesian analysis of allele imbalance among conditions (BASE)},
110 journal = {????},
111 year = {submitted for publication}
112 }</citation>
113 </citations>
114 </tool>
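
The range check described in the tool help can be illustrated with a short Python sketch. This is not the code of `check_samcomp_lost_reads.py`; the function names `read_total_count` and `check_read_range` are hypothetical, and the sketch assumes the ASE Totals file contains a line whose last field is the `total_count` value, as in the help example. It uses the range given in the Tool Description (minimum of the two counts up to their sum); the column notes mention a doubled upper bound, which is not applied here. Which summary-file column supplies the per-genome counts is not stated in the help, so the counts are passed in as plain integers.

```python
def read_total_count(ase_totals_text):
    """Return the integer value on the 'total_count' line of an ASE Totals file.

    Assumption: the line looks like '30: total_count 14', i.e. the value is
    the last whitespace-separated field on the line.
    """
    for line in ase_totals_text.splitlines():
        fields = [f.rstrip(":") for f in line.split()]
        if "total_count" in fields:
            return int(fields[-1])
    raise ValueError("no total_count line found in ASE Totals text")


def check_read_range(uniq_g1, uniq_g2, total_count):
    """Build one output row with the column names listed in the tool help.

    The flag is 1 when total_count lies between the smaller of the two
    per-genome counts and their sum (inclusive), and 0 otherwise.
    """
    low = min(uniq_g1, uniq_g2)
    high = uniq_g1 + uniq_g2
    return {
        "min_uniq_g1_uniq_g2": low,
        "sum_uniq_g1_uniq_g2": high,
        "total_counts_ase_table": total_count,
        "flag_readnum_in_range": 1 if low <= total_count <= high else 0,
    }


if __name__ == "__main__":
    # Hypothetical inputs chosen to mirror the unsuccessful example in the help.
    example_totals = "Count totals:\n30: total_count 8\n"
    row = check_read_range(14, 14, read_total_count(example_totals))
    print(row["flag_readnum_in_range"])  # prints 0: reads were lost
```

With 14 reads per genome, a total of 8 falls below the minimum of 14, so the flag is 0, matching the unsuccessful example table in the help text.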