comparison check_aln_design_file.xml @ 0:e979cb57a5d5 draft default tip

"planemo upload for repository https://github.com/McIntyre-Lab/BayesASE/tree/main/galaxy commit 9b70598ef46a73632d9e0fa0c6ce6776fb5e9d6a"
author malex
date Thu, 14 Jan 2021 21:51:36 +0000
-1:000000000000 0:e979cb57a5d5
1 <tool id="base_check_alignment_design_file" name="Check Alignment Design File" version="21.1.13">
2 <description>for correct formatting and duplicate FASTQ names </description>
3 <macros>
4 <import>macros.xml</import>
5 </macros>
6 <expand macro="requirements" />
7 <command><![CDATA[
8 check_aln_design_file.py
9 --design='$design'
10 --logfile='$logfile'
11 --dups='$dups'
12 ]]></command>
13 <inputs>
14 <param name="design" type="data" format="tabular,tsv" label="Alignment Design file" help="Design file containing FASTQ file names, sampleIDs, etc. Refer to the help section at the bottom of this page for the format. [Required]"/>
15 </inputs>
16 <outputs>
17 <data format="tabular" name="dups" label="${tool.name} on ${on_string}: FASTQ Duplicates"/>
18 <data format="tabular" name="logfile" label="${tool.name} on ${on_string}: Alignment Design Criteria"/>
19 </outputs>
20 <tests>
21 <test>
22 <param name="design" ftype="data" value="align_and_counts_test_data/alignment_design_file.tsv"/>
23 <output name="dups" ftype="data" file="align_and_counts_test_data/alignment_design_file_duplicates.tabular" />
24 <output name="logfile" ftype="data" file="align_and_counts_test_data/alignment_design_file_criteria.csv" />
25 </test>
26 </tests>
27 <help><![CDATA[
28 **Tool Description**
29
30 This tool checks that the Alignment Design File is formatted correctly and contains all required column headers. It also verifies that there are no duplicate FASTQ file names.
31 Default values are already in place to save user time.
32
33 **NOTE:** There are two Design Files that must be created and supplied by the user. They are the *Alignment Design File* and the *Comparate Design File*.
34
35 All other design files (Sample and Priors) are created and used as intermediates by the BASE workflows.
36
37 **The design file should contain the following columns, in order:**
38
39 (1) G1 - name of parental genome 1 for alignment
40 (2) G2 - name of parental genome 2 for alignment
41 (3) sampleID - sample identifier (no spaces)
42 (4) fqName - FASTQ file name, WITHOUT the extension
43 (5) fqExtension - FASTQ file extension, for example, .fq or .fastq (NOT gzipped)
44 (6) techRep - technical replicate number for each sampleID, for example, the same library run on different lanes
45 (7) readLength - the read length in base pairs
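
The column requirements above can be checked with a few lines of Python. The following is a minimal sketch, not the code in check_aln_design_file.py. The helper name `check_header` is hypothetical, and the sketch assumes a tab-delimited file with a single header row.

```python
import csv
import io

# Expected column names, in order, as listed in this help section.
EXPECTED_COLUMNS = [
    "G1", "G2", "sampleID", "fqName", "fqExtension", "techRep", "readLength",
]


def check_header(handle):
    """Return a list of problems found in the header row (empty if none).

    Hypothetical helper; assumes a tab-delimited file with one header row.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    elif header[: len(EXPECTED_COLUMNS)] != EXPECTED_COLUMNS:
        problems.append("columns are present but not in the expected order")
    return problems


# One header row and one data row in the documented layout.
example = io.StringIO(
    "G1\tG2\tsampleID\tfqName\tfqExtension\ttechRep\treadLength\n"
    "W1118\tW55\tmel_W55_Mated_E1\tmel_W55_Mated_E1_R1\t.fq\t1\t150\n"
)
print(check_header(example))
```

An empty list means the header matches the documented layout; any other result describes what needs to be fixed before running the tool.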
46
47 The sample identifier must contain the biological replicate number and the comparate condition to be tested in the Bayesian Model for Allelic Imbalance.
48
49 An example of a comparate condition is W1118_F, where W1118 is the genome and F refers to female. There must be at least two comparate conditions for Bayesian Analysis.
50
51 In the example below, there are two comparate conditions, W55_Mated and W55_Virgin, and E1 refers to the biological replicate number.
52
53 An example design file::
54
55 G1 G2 sampleID fqName fqExtension techRep readLength
56 W1118 W55 mel_W55_Mated_E1 mel_W55_Mated_E1_R1 .fq 1 150
57 W1118 W55 mel_W55_Mated_E1 mel_W55_Mated_E1_R2 .fq 2 150
58 W1118 W55 mel_W55_Mated_E1 mel_W55_Mated_E1_R3 .fq 3 150
59 W1118 W55 mel_W55_Virgin_E1 mel_W55_Virgin_E1_R1 .fq 1 150
60 W1118 W55 mel_W55_Virgin_E1 mel_W55_Virgin_E1_R2 .fq 2 150
61 W1118 W55 mel_W55_Virgin_E1 mel_W55_Virgin_E1_R3 .fq 3 150
62
63 If using simulated reads, include the technical replicate column, but label the technical replicates with the same number
64
65 Example design file for simulated data ::
66
67 G1 G2 sampleID fqName fqExtension techRep readLength
68 W1118 W55 W55_M_1 SRR1989586_1 .fq 1 96
69 W1118 W55 W55_M_2 SRR1989588_1 .fq 1 96
70 W1118 W55 W55_V_1 SRR1989592_1 .fq 1 96
71 W1118 W55 W55_V_2 SRR1989594_1 .fq 1 96
72
73 **Outputs**
74
75 (1) a logfile reporting whether the column names are correct and whether any FASTQ file names are duplicated.
76 (2) a tabular file listing any duplicated FASTQ file names, if present.
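
The duplicate check behind the second output can be illustrated as follows. This is a sketch under stated assumptions rather than the tool's actual implementation: the helper name `find_duplicate_fastq_names` is hypothetical, a FASTQ name is taken to be fqName joined with fqExtension, and the second data row has been altered from the simulated-data example so that it repeats a file name.

```python
import csv
import io
from collections import Counter


def find_duplicate_fastq_names(handle):
    """Return FASTQ names (fqName + fqExtension) that occur more than once.

    Hypothetical helper; assumes a tab-delimited file whose header row
    contains the fqName and fqExtension columns.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter(row["fqName"] + row["fqExtension"] for row in reader)
    return sorted(name for name, n in counts.items() if n > 1)


# The second row deliberately reuses SRR1989586_1 to trigger a duplicate.
design = io.StringIO(
    "G1\tG2\tsampleID\tfqName\tfqExtension\ttechRep\treadLength\n"
    "W1118\tW55\tW55_M_1\tSRR1989586_1\t.fq\t1\t96\n"
    "W1118\tW55\tW55_M_2\tSRR1989586_1\t.fq\t1\t96\n"
    "W1118\tW55\tW55_V_1\tSRR1989592_1\t.fq\t1\t96\n"
)
print(find_duplicate_fastq_names(design))
```

A design file with no repeated FASTQ names yields an empty list, which corresponds to an empty duplicates output.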
77
78 ]]></help>
79 <citations>
80 <citation type="bibtex">@ARTICLE{Miller20BASE,
81 author = {Brecca Miller and Alison M. Morse and Elyse Borgert and Zihao Liu and Kelsey Sinclair and Gavin Gamble and Fei Zou and Jeremy Newman and Luis Leon Novello and Fabio Marroni and Lauren M. McIntyre},
82 title = {Testcrosses are an efficient strategy for identifying cis regulatory variation: Bayesian analysis of allele imbalance among conditions (BASE)},
83 journal = {????},
84 year = {submitted for publication}
85 }</citation>
86 </citations>
87 </tool>