comparison check_for_lost_reads.xml @ 0:e979cb57a5d5 draft default tip
"planemo upload for repository https://github.com/McIntyre-Lab/BayesASE/tree/main/galaxy commit 9b70598ef46a73632d9e0fa0c6ce6776fb5e9d6a"
author | malex |
---|---|
date | Thu, 14 Jan 2021 21:51:36 +0000 |
parents | |
children |
-1:000000000000 | 0:e979cb57a5d5 |
---|---|
1 <tool id="check_for_lost_reads" name="Check for lost reads" version="21.1.13"> | |
2 <description>verify starting FASTQ read number equals read number after running BWASplitSAM tool</description> | |
3 <macros> | |
4 <import>macros.xml</import> | |
5 </macros> | |
6 <expand macro="requirements"/> | |
7 <command><![CDATA[ | |
8 check_lost_reads.py | |
9 --alnSum1=$alnSum1 | |
10 --alnSum2=$alnSum2 | |
11 --fq=$fq | |
12 --out=$out | |
13 ]]></command> | |
14 <inputs> | |
15 <param name="alnSum1" type="data" format="tabular" label="BWASplitSAM Alignment Summary G1" help="The G1 alignment summary file [from BWASplitSAM tool] for updated genome1 containing all read types [Required]"/> | |
16 <param name="alnSum2" type="data" format="tabular" label="BWASplitSAM Alignment Summary G2" help="The G2 alignment summary file [from BWASplitSAM tool] for updated genome2 containing all read types [Required]"/> | |
17 <param name="fq" type="data" format="fastq" label="Name of the FASTQ file" help="Name of FASTQ file used to generate the alignments selected above."/> | |
18 </inputs> | |
19 <outputs> | |
20 <data format="tabular" name="out" label="${tool.name} on ${on_string}: Check start readNum = alignment readNum"/> | |
21 </outputs> | |
22 <tests> | |
23 <test> | |
24 <param name="alnSum1" ftype="data" value="align_and_counts_test_data/W1118_G1_BWASplitSAM_summary.tabular"/> | |
25 <param name="alnSum2" ftype="data" value="align_and_counts_test_data/W55_G2_BWASplitSAM_summary.tabular"/> | |
26 <param name="fq" ftype="data" value="align_and_counts_test_data/W55_M_1_1.fastq"/> | |
27 <output name="out" file="align_and_counts_test_data/check_for_lost_reads_BASE_test_data.tabular" /> | |
28 </test> | |
29 </tests> | |
30 <help><![CDATA[ | |
31 **Tool Description** | |
32 | |
33 This tool checks that all reads in the starting FASTQ file are accounted for in the G1 and G2 SAM files after running the BWASplitSAM tool. | |
34 The read count in the input FASTQ file is compared to the 'count_total_reads' column in the alignment summary TSV files generated by the BWASplitSAM tool. | |
35 | |
36 | |
37 **Input** | |
38 The tool requires three input files: | |
39 | |
40 (1) The output summary TSV file generated from the BWASplitSAM tool for the updated genome1 (G1) SAM file | |
41 (2) The output summary TSV file generated from the BWASplitSAM tool for the updated genome2 (G2) SAM file | |
42 (3) The FASTQ file used to generate the above G1 and G2 SAM files - used to calculate the number of starting reads | |
43 | |
44 Example summary TSV file from BWASplitSAM script: | |
45 | |
46 +---------------+---------------------+---------------------------------------+---------------------+---------------------+----------------------+---------------------+-----------------+ | |
47 | Name | count_total_reads | count_mapped_read_opposite_strand | count_unmapped_read | count_mapped_read | count_ambiguous_read | count_chimeric_read | count_notprimary | | |
48 +===============+=====================+=======================================+=====================+=====================+======================+=====================+=================+ | |
49 | dataset_2216 | 14 | 5 | 0 | 9 | 0 | 0 | 0 | | |
50 +---------------+---------------------+---------------------------------------+---------------------+---------------------+----------------------+---------------------+-----------------+ | |
51 | |
52 | |
53 **Output** | |
54 | |
55 A TSV file containing: | |
56 (1) starting read counts in the FASTQ file [start_read_num] | |
57 (2) read counts in the G1 alignment [readNum_G1] | |
58 (3) read counts in the G2 alignment [readNum_G2] | |
59 (4) indicator flag for whether the starting count = G1 count [flag_start_readNum_eq_readNum_G1] | |
60 (5) indicator flag for whether the starting count = G2 count [flag_start_readNum_eq_readNum_G2] | |
61 | |
62 Sample Output TSV file | |
63 | |
64 +---------------+---------------------+---------------+------------+------------------------------------+------------------------------------+ | |
65 | fqName | start_read_num | readNum_G1 | readNum_G2 | flag_start_readNum_eq_readNum_G1 | flag_start_readNum_eq_readNum_G2 | | |
66 +===============+=====================+===============+============+====================================+====================================+ | |
67 | dataset_2216 | 14 | 14 | 14 | 1 | 1 | | |
68 +---------------+---------------------+---------------+------------+------------------------------------+------------------------------------+ | |
69 | |
70 Columns are:: | |
71 | |
72 ◦ fqName | |
73 ◦ start_read_num: The total number of reads in the FASTQ file | |
74 ◦ readNum_G1: The total number of reads in the summary TSV file output from BWASplitSAM for updated parental genome 1 (G1) | |
75 ◦ readNum_G2: The total number of reads in the summary TSV file output from BWASplitSAM for updated parental genome 2 (G2) | |
76 ◦ flag_start_readNum_eq_readNum_{G1/G2}: 0/1 indicator flag where “1” means that the number of reads in the FASTQ file matches the total read number in the G1 or G2 BWASplitSAM summary file. | |
77 | |
78 In the above example, flag_start_readNum_eq_readNum_G1 and flag_start_readNum_eq_readNum_G2 are both 1, indicating all reads are accounted for. | |
79 | |
80 The BayesASE align and count workflow should be rerun if either flag_start_readNum_eq_readNum_G1 or flag_start_readNum_eq_readNum_G2 is 0. | |
81 | |
82 ]]></help> | |
83 <citations> | |
84 <citation type="bibtex">@ARTICLE{Miller20BASE, | |
85 author = {Brecca Miller and Alison M. Morse and Elyse Borgert and Zihao Liu and Kelsey Sinclair and Gavin Gamble and Fei Zou and Jeremy Newman and Luis Leon Novello and Fabio Marroni and Lauren M. McIntyre}, | |
86 title = {Testcrosses are an efficient strategy for identifying cis regulatory variation: Bayesian analysis of allele imbalance among conditions (BASE)}, | |
87 journal = {????}, | |
88 year = {submitted for publication} | |
89 }</citation> | |
90 </citations> | |
91 </tool> |
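
The check described in the tool help can be sketched in a few lines of Python. This is a minimal illustration, not the actual `check_lost_reads.py` shipped with BayesASE: the function names are hypothetical, and it assumes a standard 4-line-per-record FASTQ file and a summary TSV with a header row followed by one data row. Only the column names (`count_total_reads`, `start_read_num`, `readNum_G1`, and so on) are taken from the help text.

```python
import csv
import io


def count_fastq_reads(handle):
    """Count reads, assuming every record spans exactly 4 lines."""
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("FASTQ line count is not a multiple of 4")
    return n_lines // 4


def total_reads_from_summary(handle):
    """Read 'count_total_reads' from the first data row of a summary TSV."""
    first_row = next(csv.DictReader(handle, delimiter="\t"))
    return int(first_row["count_total_reads"])


def check_lost_reads(fq_name, fq_handle, aln_sum1, aln_sum2):
    """Compare the starting read count with the G1 and G2 totals."""
    start = count_fastq_reads(fq_handle)
    g1 = total_reads_from_summary(aln_sum1)
    g2 = total_reads_from_summary(aln_sum2)
    return {
        "fqName": fq_name,
        "start_read_num": start,
        "readNum_G1": g1,
        "readNum_G2": g2,
        # 1 means the counts match, 0 means reads were lost or gained
        "flag_start_readNum_eq_readNum_G1": int(start == g1),
        "flag_start_readNum_eq_readNum_G2": int(start == g2),
    }


if __name__ == "__main__":
    # Made-up in-memory inputs: two reads, and matching summaries.
    fastq = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
    summary_g1 = io.StringIO("Name\tcount_total_reads\nexample\t2\n")
    summary_g2 = io.StringIO("Name\tcount_total_reads\nexample\t2\n")
    print(check_lost_reads("example", fastq, summary_g1, summary_g2))
```

If either flag in the returned record is 0, the help text's advice applies: rerun the align and count workflow.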