annotate mutspecSplit.xml @ 4:916846f73e25 draft

Uploaded
author iarc
date Fri, 29 Apr 2016 05:11:28 -0400
parents 748b7a8b634c
children 46a10309dfe2
<tool id="mutSpecsplit" name="MutSpec Split" version="0.1" hidden="false" force_history_refresh="True">
    <description>Split a tabular file by sample ID</description>

    <requirements>
        <requirement type="set_environment">SCRIPT_PATH</requirement>
        <requirement type="package" version="5.18.1">perl</requirement>
    </requirements>

    <command interpreter="perl">
        mutspecSplit.pl -f $input -c $column
    </command>

    <inputs>
        <param name="input" type="data" format="tabular" label="Input file" help="In batch mode (multiple datasets), all files must contain the same sample ID column. The tool does not support a dataset list as input!" />
        <param name="column" type="data_column" data_ref="input" label="Split by" use_header_names="true"/>
    </inputs>

    <outputs>
        <collection name="splitted_output" type="list" label="collection">
            <discover_datasets pattern="__name__" ext="tabular" directory="outputFiles"/>
        </collection>
    </outputs>

    <help>

**What it does**

This tool splits a file into several files based on the content of the selected column.
For example, it can split a file that contains data on 10 samples into 10 files, one per sample, using the sample ID column.
The resulting files are saved into a dataset list (collection).

--------------------------------------------------------------------------------------------------------------------------------------------------

**Input**

One or more tab-delimited text files.

If multiple files are selected, they must all contain the column on which you want to split.

.. class:: warningmark

The tool does not support a dataset list as input!

--------------------------------------------------------------------------------------------------------------------------------------------------

**Output**

A dataset list containing the tab-delimited text files produced by splitting the input file(s).

.. class:: warningmark

If a large number of files are generated, you will need to refresh the history to see all the files included in the dataset list. The entire list of files may still not be displayed correctly because of a known bug in Galaxy that may be fixed in future versions.

--------------------------------------------------------------------------------------------------------------------------------------------------

**Example**

Split the following file by sample ID::

    Chr Start End Ref Alt Func.refGene Gene.refGene ExonicFunc.refGene AAChange.refGene genomicSuperDups 1000g2012apr_all snp137 esp6500si_all cosmic67 Strand Context Mutation_GRCh37_chromosome_number Mutation_GRCh37_genome_position Description_Ref_Genomic Description_Alt_Genomic Sample_name Pubmed_PMID Age Comments
    chr12 82752552 82752552 G A exonic METTL25 nonsynonymous SNV NM_032230:c.G208A:p.E70K NA NA NA NA NA + GTCGGAGACGGAGGCCCTGCC chr12 82752552 G A APA29 23913001 2 NA
    chr11 86663436 86663436 C A exonic FZD4 nonsynonymous SNV NM_012193:c.G362T:p.C121F NA NA NA NA NA - GACTGAAAGACACATGCCGCC chr11 86663436 C A APA12 21311022 34 Tissue Remark Fixed:Remark
    chr12 57872994 57872994 G A exonic ARHGAP9 nonsynonymous SNV NM_001080157:c.C196T:p.R66C NA NA NA 0.000077 ID=COSM431582;OCCURENCE=2(breast) - GCTTCTAGGCGTCTTGCCAAC chr12 57872994 G A APA12 21311022 34 Tissue Remark Fixed:Remark


This creates a dataset list with two datasets:


APA29::

    Chr Start End Ref Alt Func.refGene Gene.refGene ExonicFunc.refGene AAChange.refGene genomicSuperDups 1000g2012apr_all snp137 esp6500si_all cosmic67 Strand Context Mutation_GRCh37_chromosome_number Mutation_GRCh37_genome_position Description_Ref_Genomic Description_Alt_Genomic Sample_name Pubmed_PMID Age Comments
    chr12 82752552 82752552 G A exonic METTL25 nonsynonymous SNV NM_032230:c.G208A:p.E70K NA NA NA NA NA + GTCGGAGACGGAGGCCCTGCC chr12 82752552 G A APA29 23913001 2 NA


APA12::

    Chr Start End Ref Alt Func.refGene Gene.refGene ExonicFunc.refGene AAChange.refGene genomicSuperDups 1000g2012apr_all snp137 esp6500si_all cosmic67 Strand Context Mutation_GRCh37_chromosome_number Mutation_GRCh37_genome_position Description_Ref_Genomic Description_Alt_Genomic Sample_name Pubmed_PMID Age Comments
    chr11 86663436 86663436 C A exonic FZD4 nonsynonymous SNV NM_012193:c.G362T:p.C121F NA NA NA NA NA - GACTGAAAGACACATGCCGCC chr11 86663436 C A APA12 21311022 34 Tissue Remark Fixed:Remark
    chr12 57872994 57872994 G A exonic ARHGAP9 nonsynonymous SNV NM_001080157:c.C196T:p.R66C NA NA NA 0.000077 ID=COSM431582;OCCURENCE=2(breast) - GCTTCTAGGCGTCTTGCCAAC chr12 57872994 G A APA12 21311022 34 Tissue Remark Fixed:Remark



    </help>


    <citations>
        <citation type="bibtex">
            @article{ardin_mutspec:_2016,
                title = {{MutSpec}: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes},
                volume = {17},
                issn = {1471-2105},
                doi = {10.1186/s12859-016-1011-z},
                shorttitle = {{MutSpec}},
                abstract = {{BACKGROUND}: The nature of somatic mutations observed in human tumors at single gene or genome-wide levels can reveal information on past carcinogenic exposures and mutational processes contributing to tumor development. While large amounts of sequencing data are being generated, the associated analysis and interpretation of mutation patterns that may reveal clues about the natural history of cancer present complex and challenging tasks that require advanced bioinformatics skills. To make such analyses accessible to a wider community of researchers with no programming expertise, we have developed within the web-based user-friendly platform Galaxy a first-of-its-kind package called {MutSpec}.
                {RESULTS}: {MutSpec} includes a set of tools that perform variant annotation and use advanced statistics for the identification of mutation signatures present in cancer genomes and for comparing the obtained signatures with those published in the {COSMIC} database and other sources. {MutSpec} offers an accessible framework for building reproducible analysis pipelines, integrating existing methods and scripts developed in-house with publicly available R packages. {MutSpec} may be used to analyse data from whole-exome, whole-genome or targeted sequencing experiments performed on human or mouse genomes. Results are provided in various formats including rich graphical outputs. An example is presented to illustrate the package functionalities, the straightforward workflow analysis and the richness of the statistics and publication-grade graphics produced by the tool.
                {CONCLUSIONS}: {MutSpec} offers an easy-to-use graphical interface embedded in the popular Galaxy platform that can be used by researchers with limited programming or bioinformatics expertise to analyse mutation signatures present in cancer genomes. {MutSpec} can thus effectively assist in the discovery of complex mutational processes resulting from exogenous and endogenous carcinogenic insults.},
                pages = {170},
                number = {1},
                journaltitle = {{BMC} Bioinformatics},
                author = {Ardin, Maude and Cahais, Vincent and Castells, Xavier and Bouaoun, Liacine and Byrnes, Graham and Herceg, Zdenko and Zavadil, Jiri and Olivier, Magali},
                date = {2016},
                pmid = {27091472},
                keywords = {Galaxy, Mutation signatures, Mutation spectra, Single base substitutions}
            }
        </citation>
    </citations>

</tool>
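
The splitting behaviour described in the tool's help section can be illustrated with a small sketch. This is not the wrapped `mutspecSplit.pl` script, whose source is not shown here. It is a hypothetical Python rendering that assumes the first row is a header repeated in every output, as in the help example. The function name `split_by_column` and the 0-based `column` argument are illustrative only.

```python
import csv
import io
from collections import defaultdict


def split_by_column(lines, column):
    """Group tab-delimited rows by the value found in `column` (0-based).

    Assumption: the first row is a header and is copied into every group,
    mirroring the example in the help text.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    groups = defaultdict(list)
    for row in reader:
        if not row:  # skip blank lines
            continue
        groups[row[column]].append(row)
    # One entry per distinct value, each starting with the shared header.
    return {key: [header] + rows for key, rows in groups.items()}


if __name__ == "__main__":
    # Reduced version of the help example: three rows, two sample names.
    sample = io.StringIO(
        "Chr\tStart\tSample_name\n"
        "chr12\t82752552\tAPA29\n"
        "chr11\t86663436\tAPA12\n"
        "chr12\t57872994\tAPA12\n"
    )
    for name, rows in split_by_column(sample, column=2).items():
        print(name, len(rows) - 1, "row(s)")
```

Grouping in memory keeps the sketch short. A real splitter handling large files would more likely stream each row to a per-sample output file.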