diff check_samcomp_for_lost_reads.xml @ 0:e979cb57a5d5 draft default tip

"planemo upload for repository https://github.com/McIntyre-Lab/BayesASE/tree/main/galaxy commit 9b70598ef46a73632d9e0fa0c6ce6776fb5e9d6a"
author malex
date Thu, 14 Jan 2021 21:51:36 +0000
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/check_samcomp_for_lost_reads.xml	Thu Jan 14 21:51:36 2021 +0000
@@ -0,0 +1,114 @@
+<tool id="check_samcomp_for_lost_reads" name="Check SAM Compare Output" version="21.1.13">
+    <description> - check read numbers in and out of Compare SAM Files and Create ASE Counts Tables tool</description>
+    <macros>
+      <import>macros.xml</import>
+    </macros>
+    <expand macro="requirements" />
+    <command><![CDATA[
+    check_samcomp_lost_reads.py
+    --summary1=$sum1
+    --summary2=$sum2
+    --ase_names=$ase.element_identifier
+    --ase=$ase
+    --out=$out
+]]></command>
+    <inputs>
+        <param name="sum1" type="data" format="tabular" label="Remove Reads Summary file for updated genome 1 (G1)" help="Select the summary file containing the read counts after dropping non-overlapping reads [Required]"/>
+        <param name="sum2" type="data" format="tabular" label="Remove Reads Summary file for updated genome 2 (G2)" help="Select the summary file containing the read counts after dropping non-overlapping reads [Required]"/>
+        <param name="ase" type="data" format="tsv" label="ASE Totals Table" help="Select the ASE Totals table containing the read counts generated by the Compare SAM Files and Create ASE Counts Tables tool [Required]"/>
+    </inputs>
+    <outputs>
+        <data name="out" format="tabular" label="${tool.name} on ${on_string}: Check SAM Compare Output"/>
+    </outputs>
+    <tests>
+        <test>
+            <param name="sum1" ftype="data"      value="align_and_counts_test_data/number_rows_left_after_removed_reads.tabular"/>
+            <param name="sum2" ftype="data"      value="align_and_counts_test_data/number_rows_left_after_removed_reads_G2.tabular"/>
+            <param name="ase"  ftype="data"      value="align_and_counts_test_data/ASE_totals_table_BASE_test_data.tsv" />
+            <output name="out"     file="align_and_counts_test_data/check_SAM_compare_for_lost_reads_BASE_test_data.tabular" />
+        </test>
+    </tests>
+    <help><![CDATA[
+**Tool Description**
+
+The Check SAM Compare Output tool checks that the number of reads going into the Compare SAM Files and Create ASE Counts Tables tool matches the number coming out.
+The total of all reads mapped to each feature should be the sum of all unique reads mapped to that feature from the two initial alignment files.
+This implies that the total must be at least the number of reads mapping to one genome and no more than the sum of reads mapping to both genomes.
+Read counts in the ASE Totals file that fall outside this range indicate that the Compare SAM Files and Create ASE Counts Tables tool should be rerun.
+
+**This tool takes the following input:**
+
+        (1) Remove Reads Summary for G1 - the summary file generated from the Remove Reads tool containing the number of rows left after non-overlapping reads were removed for G1
+        (2) Remove Reads Summary for G2 - the summary file generated from the Remove Reads tool containing the number of rows left after non-overlapping reads were removed for G2
+        (3) ASE Totals Table - contains read counts generated by the SAM compare tool
+
+An example Remove Reads summary file:
+
+    +---------------+---------------------------+---------------------+
+    |   fqNa        |  number_overlapping_rows  | total_number_rows   |
+    +===============+===========================+=====================+
+    | dataset_2215  |   918                     |     919             |
+    +---------------+---------------------------+---------------------+
+
+
+An example of an ASE Totals file::
+
+        Count totals:
+    1:	a_single_exact	0
+    2:	a_single_inexact 0
+    3:	a_multi_exact	0
+    4:	a_multi_inexact	0
+    5:	b_single_exact	0
+    6:	b_single_inexact 0
+    7:	b_multi_exact	0
+    8:	b_multi_inexact	0
+    9:	both_single_exact_same	0
+    10:	both_single_exact_diff	6
+    11:	both_single_inexact_same  0
+    12:	both_single_inexact_diff  8
+    13:	both_inexact_diff_equal	5
+    14:	both_inexact_diff_a_better  1
+    15:	both_inexact_diff_b_better	2
+    16:	both_multi_exact	0
+    17:	both_multi_inexact	0
+    18:	a_single_exact_b_single_inexact	0
+    19:	a_single_inexact_b_single_exact	0
+    20:	a_single_exact_b_multi_exact	0
+    21:	a_multi_exact_b_single_exact	0
+    22:	a_single_exact_b_multi_inexact	0
+    23:	a_multi_inexact_b_single_exact	0
+    24:	a_single_inexact_b_multi_exact	0
+    25:	a_multi_exact_b_single_inexact	0
+    26:	a_single_inexact_b_multi_inexact 0
+    27:	a_multi_inexact_b_single_inexact 0
+    28:	a_multi_exact_b_multi_inexact	0
+    29:	a_multi_inexact_b_multi_exact	0
+    30:	total_count	14
+
+**This tool will output a tabular file containing the following columns:**
+
+        (1) fqName
+        (2) min_uniq_g1_uniq_g2: The smaller of the unique read counts from the two BWA files
+        (3) sum_uniq_g1_uniq_g2: The sum of the unique reads in the two BWA files
+        (4) total_counts_ase_table: The final total count in the ASE totals file (should be between (2) and (3) doubled for the check to be successful)
+        (5) flag_readnum_in_range: A 0/1 indicator flag that is equal to 1 if the check was successful or 0 if the check was unsuccessful
+
+An example of an unsuccessful output file where reads were lost:
+
+    +---------------+---------------------+--------------------------------------+----------------------+----------------------+
+    |   fqName      |min_uniq_g1_uniq_g2  | sum_uniq_g1_uniq_g2                  |total_counts_ase_table| flag_readnum_in_range|
+    +===============+=====================+======================================+======================+======================+
+    | name_of_fq    |  14                 |   28                                 |    8                 |    0                 |
+    +---------------+---------------------+--------------------------------------+----------------------+----------------------+
+
+    ]]></help>
+
+    <citations>
+        <citation type="bibtex">@ARTICLE{Miller20BASE,
+            author = {Brecca Miller and Alison M. Morse and Elyse Borgert and Zihao Liu and Kelsey Sinclair and Gavin Gamble and Fei Zou and Jeremy Newman and Luis Leon Novello and Fabio Marroni and Lauren M. McIntyre},
+            title = {Testcrosses are an efficient strategy for identifying cis regulatory variation: Bayesian analysis of allele imbalance among conditions (BASE)},
+            journal = {????},
+            year = {submitted for publication}
+        }</citation>
+    </citations>
+</tool>
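
The range check described in the tool's help text can be sketched in Python as follows. This is a minimal illustration, not the contents of check_samcomp_lost_reads.py: the function name `check_read_range`, its arguments, and the inclusive bounds are assumptions made for this example. It uses the range stated in the Tool Description (at least the smaller unique-read count, at most the sum of both), and the dictionary keys mirror the output column names listed in the help.

```python
def check_read_range(uniq_g1: int, uniq_g2: int, total_count: int) -> dict:
    """Build one output row for a sample (hypothetical helper, not the tool's code).

    uniq_g1, uniq_g2: unique read counts from the G1 and G2 summary files.
    total_count: the total_count value from the ASE Totals table.
    """
    lower = min(uniq_g1, uniq_g2)  # min_uniq_g1_uniq_g2
    upper = uniq_g1 + uniq_g2      # sum_uniq_g1_uniq_g2
    # Assumption: both bounds are inclusive.
    in_range = lower <= total_count <= upper
    return {
        "min_uniq_g1_uniq_g2": lower,
        "sum_uniq_g1_uniq_g2": upper,
        "total_counts_ase_table": total_count,
        "flag_readnum_in_range": 1 if in_range else 0,
    }


if __name__ == "__main__":
    # Values taken from the unsuccessful example in the help text:
    # min = 14, sum = 28, ASE total = 8, so the flag is 0.
    print(check_read_range(14, 14, 8))
```

With the values from the unsuccessful example (14 unique reads per genome, 8 in the ASE Totals table), the total falls below the lower bound and the flag is 0.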