diff check_samcomp_for_lost_reads.xml @ 0:e979cb57a5d5 draft default tip
"planemo upload for repository https://github.com/McIntyre-Lab/BayesASE/tree/main/galaxy commit 9b70598ef46a73632d9e0fa0c6ce6776fb5e9d6a"
| author | malex |
|---|---|
| date | Thu, 14 Jan 2021 21:51:36 +0000 |
| parents | |
| children | |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/check_samcomp_for_lost_reads.xml Thu Jan 14 21:51:36 2021 +0000
@@ -0,0 +1,114 @@
+<tool id="check_samcomp_for_lost_reads" name="Check SAM Compare Output" version="21.1.13">
+    <description> - check read numbers in and out of Compare SAM Files and Create ASE Counts Tables tool</description>
+    <macros>
+        <import>macros.xml</import>
+    </macros>
+    <expand macro="requirements" />
+    <command><![CDATA[
+    check_samcomp_lost_reads.py
+    --summary1=$sum1
+    --summary2=$sum2
+    --ase_names=$ase.element_identifier
+    --ase=$ase
+    --out=$out
+]]></command>
+    <inputs>
+        <param name="sum1" type="data" format="tabular" label="Remove Reads Summary file for updated genome 1 (G1)" help="Select the summary file containing the read counts after dropping non-overlapping reads [Required]"/>
+        <param name="sum2" type="data" format="tabular" label="Remove Reads Summary file for updated genome 2 (G2)" help="Select the summary file containing the read counts after dropping non-overlapping reads [Required]"/>
+        <param name="ase" type="data" format="tsv" label="ASE Totals Table" help="Select the ASE Totals table containing the read counts generated by the Compare SAM Files and Create ASE Counts Tables tool [Required]"/>
+    </inputs>
+    <outputs>
+        <data name="out" format="tabular" label="${tool.name} on ${on_string}: Check SAM Compare Output"/>
+    </outputs>
+    <tests>
+        <test>
+            <param name="sum1" ftype="data" value="align_and_counts_test_data/number_rows_left_after_removed_reads.tabular"/>
+            <param name="sum2" ftype="data" value="align_and_counts_test_data/number_rows_left_after_removed_reads_G2.tabular"/>
+            <param name="ase" ftype="data" value="align_and_counts_test_data/ASE_totals_table_BASE_test_data.tsv" />
+            <output name="out" file="align_and_counts_test_data/check_SAM_compare_for_lost_reads_BASE_test_data.tabular" />
+        </test>
+    </tests>
+    <help><![CDATA[
+**Tool Description**
+
+The Check SAM Compare Output tool checks that the numbers of reads into and out of the Compare SAM Files and Create ASE Counts Tables tool are the same.
+The total of all reads mapped to each feature should be the sum of all unique reads mapped to that feature from the two initial alignment files.
+This implies that the total must be at least the number of reads mapping to one genome and no more than the sum of reads mapping to both genomes.
+Numbers of reads in the ASE Totals file outside this range indicate that the Compare SAM Files and Create ASE Counts Tables tool should be rerun.
+
+**This tool takes the following input:**
+
+    (1) Remove Reads Summary for G1 - the summary file generated from the Remove Reads tool containing the number of rows left after non-overlapping reads were removed for G1
+    (2) Remove Reads Summary for G2 - the summary file generated from the Remove Reads tool containing the number of rows left after non-overlapping reads were removed for G2
+    (3) ASE Totals Table - contains read counts generated by the SAM compare tool
+
+An example Remove Reads summary file:
+
+    +--------------+-------------------------+-------------------+
+    | fqNa         | number_overlapping_rows | total_number_rows |
+    +==============+=========================+===================+
+    | dataset_2215 | 918                     | 919               |
+    +--------------+-------------------------+-------------------+
+
+
+An example of an ASE Totals file::
+
+    Count totals:
+    1: a_single_exact 0
+    2: a_single_inexact 0
+    3: a_multi_exact 0
+    4: a_multi_inexact 0
+    5: b_single_exact 0
+    6: b_single_inexact 0
+    7: b_multi_exact 0
+    8: b_multi_inexact 0
+    9: both_single_exact_same 0
+    10: both_single_exact_diff 6
+    11: both_single_inexact_same 0
+    12: both_single_inexact_diff 8
+    13: both_inexact_diff_equal 5
+    14: both_inexact_diff_a_better 1
+    15: both_inexact_diff_b_better 2
+    16: both_multi_exact 0
+    17: both_multi_inexact 0
+    18: a_single_exact_b_single_inexact 0
+    19: a_single_inexact_b_single_exact 0
+    20: a_single_exact_b_multi_exact 0
+    21: a_multi_exact_b_single_exact 0
+    22: a_single_exact_b_multi_inexact 0
+    23: a_multi_inexact_b_single_exact 0
+    24: a_single_inexact_b_multi_exact 0
+    25: a_multi_exact_b_single_inexact 0
+    26: a_single_inexact_b_multi_inexact 0
+    27: a_multi_inexact_b_single_inexact 0
+    28: a_multi_exact_b_multi_inexact 0
+    29: a_multi_inexact_b_multi_exact 0
+    30: total_count 14
+
+**This tool will output a tabular file containing the following columns:**
+
+    (1) fqName
+    (2) min_uniq_g1_uniq_g2: The minimum number of unique reads of the two BWA files
+    (3) sum_uniq_g1_uniq_g2: The sum of the unique reads in the two BWA files
+    (4) total_counts_ase_table: The final total count in the ASE totals file (should be between (2) and (3) doubled for the check to be successful)
+    (5) flag_readnum_in_range: A 0/1 indicator flag that is equal to 1 if the check was successful or 0 if the check was unsuccessful
+
+An example of an unsuccessful output file where reads were lost:
+
+    +------------+---------------------+---------------------+------------------------+-----------------------+
+    | fqName     | min_uniq_g1_uniq_g2 | sum_uniq_g1_uniq_g2 | total_counts_ase_table | flag_readnum_in_range |
+    +============+=====================+=====================+========================+=======================+
+    | name_of_fq | 14                  | 28                  | 8                      | 0                     |
+    +------------+---------------------+---------------------+------------------------+-----------------------+
+
+    ]]></help>
+
+    <citations>
+        <citation type="bibtex">@ARTICLE{Miller20BASE,
+        author = {Brecca Miller and Alison M. Morse and Elyse Borgert and Zihao Liu and Kelsey Sinclair and Gavin Gamble and Fei Zou and Jeremy Newman and Luis Leon Novello and Fabio Marroni and Lauren M. McIntyre},
+        title = {Testcrosses are an efficient strategy for identifying cis regulatory variation: Bayesian analysis of allele imbalance among conditions (BASE)},
+        journal = {????},
+        year = {submitted for publication}
+        }</citation>
+    </citations>
+</tool>
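
The range check described in the tool's help text can be sketched in Python. This is an illustration only, not the contents of `check_samcomp_lost_reads.py`, which this changeset does not include. The function names `parse_total_count` and `check_read_range` are hypothetical, the ASE Totals layout is assumed to match the example listing in the help text, and the bounds follow the Tool Description (minimum of the two unique-read counts up to their sum); the "doubled" wording in the description of column (4) is not modeled here.

```python
import re


def parse_total_count(ase_totals_text: str) -> int:
    """Return the total_count value from an ASE Totals listing.

    Assumes a line shaped like '30: total_count 14', as in the help text example.
    """
    for line in ase_totals_text.splitlines():
        match = re.search(r"\btotal_count\s+(\d+)\s*$", line)
        if match:
            return int(match.group(1))
    raise ValueError("no total_count line found")


def check_read_range(uniq_g1: int, uniq_g2: int, total_count: int) -> dict:
    """Build one output row using the column names listed in the help text."""
    low = min(uniq_g1, uniq_g2)   # at least the reads mapping to one genome
    high = uniq_g1 + uniq_g2      # at most the reads mapping to both genomes
    return {
        "min_uniq_g1_uniq_g2": low,
        "sum_uniq_g1_uniq_g2": high,
        "total_counts_ase_table": total_count,
        "flag_readnum_in_range": 1 if low <= total_count <= high else 0,
    }


if __name__ == "__main__":
    # Values from the unsuccessful example table: 8 is below the minimum of 14.
    print(check_read_range(14, 14, 8))
```

With the values from the unsuccessful example (minimum 14, sum 28, total 8), the sketch sets `flag_readnum_in_range` to 0, matching the table in the help text.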