view mutspecFilter.xml @ 4:916846f73e25 draft
Uploaded
author | iarc |
---|---|
date | Fri, 29 Apr 2016 05:11:28 -0400 |
parents | 9d363eb081b5 |
children | 46a10309dfe2 |
<tool id="MutSpecfilter" name="MutSpec Filter" version="0.1" hidden="false">

<description>Filter out variants present in public databases</description>

<requirements>
    <requirement type="set_environment">SCRIPT_PATH</requirement>
    <requirement type="package" version="5.18.1">perl</requirement>
</requirements>

<command interpreter="perl">
    mutspecFilter.pl
    --dir \$SCRIPT_PATH
    $segDup
    $esp
    $thG
    #if str($FilterdbSNP.dbSNP) == "true" or $FilterdbSNP.dbSNP == True:
        --dbSNP ${FilterdbSNP.column}
    #else
        --dbSNP 0
    #end if
    --refGenome ${refGenome}
    --outfile $output
    $input
</command>

<inputs>
    <param name="input" type="data" format="txt" label="Input file"/>

    <param name="refGenome" type="select" label="Reference genome" help="All your data should have been annotated with the selected genome">
        <options from_data_table="annovar_index" />
    </param>

    <conditional name="FilterdbSNP">
        <param name="dbSNP" type="boolean" checked="true" truevalue="true" label="Filter against dbSNP database" help="Remove variants with a RS number" />
        <when value="true">
            <param name="column" type="data_column" data_ref="input" label="Select the dbSNP column for filtering" use_header_names="true" help="Select a column name snp or snpNonFlagged" />
        </when>
    </conditional>

    <param name="segDup" type="boolean" checked="true" truevalue="--segDup" falsevalue="" label="Filter against SegDup database" help="Remove variants present at >= 0.9 frequency in the genomic duplicate segments database" />

    <param name="esp" type="boolean" checked="true" truevalue="--esp" falsevalue="" label="Filter against the ESP database" help="Remove variants present at frequency > 0.001 in the Exome Sequencing Project database (only valid for human genomes)" />

    <param name="thG" type="boolean" checked="true" truevalue="--thG" falsevalue="" label="Filter against the 1000g database project" help="Remove variants present at frequency > 0.001 in the 1000 genome database (only valid for human genomes)" />
</inputs>

<outputs>
    <data type="data" name="output" format="tabular" label="${input.name.split(' ')[0]} filtered" />
</outputs>

<help>

**What it does**

Filter a file annotated with the MutSpec-Annot tool. Variants present in public databases (dbSNP, SegDup, ESP and 1000 Genomes, obtained from Annovar) are removed from the input file, using the frequency limits described above.

.. class:: warningmark

The ESP and 1000 Genomes databases can be used only for human genomes.

--------------------------------------------------------------------------------------------------------------------------------------------------

**Input**

.. class:: warningmark

Tab-delimited text files generated by the MutSpec-Annot tool.

--------------------------------------------------------------------------------------------------------------------------------------------------

**Output**

Tab-delimited text file from which variants considered to be neutral polymorphisms have been removed.

--------------------------------------------------------------------------------------------------------------------------------------------------

**Example**

Filter the following file::

    Chr Start End Ref Alt Func.refGene Gene.refGene ExonicFunc.refGene AAChange.refGene genomicSuperDups snp138 1000g2014oct_all esp6500si_all Strand context Chromosome Start_Position End_Position Reference_Allele Tumor_Seq_Allele2
    chr7 121717919 121717920 - G exonic AASS frameshift insertion AASS:NM_005763:exon23:c.2634dupC:p.A879fs NA rs147476318 NA NA - GCG chr7 121717919 121717920 - G
    chr1 230846235 230846235 T A exonic AGT nonsynonymous SNV AGT:NM_000029:exon2:c.A362T:p.H121L NA NA NA NA - GTG chr1 230846235 230846235 T A
    chr14 33290999 33290999 A G exonic AKAP6 nonsynonymous SNV AKAP6:NM_004274:exon13:c.A3980G:p.D1327G NA NA NA NA + GAC chr14 33290999 33290999 A G
    chr12 8082458 8082458 C T exonic SLC2A3 nonsynonymous SNV SLC2A3:NM_006931:exon6:c.G683A:p.R228Q NA rs200481428 0.000199681 NA - CCG chr12 8082458 8082458 C T
    chr4 70156391 70156391 T C exonic UGT2B28 nonsynonymous SNV UGT2B28:NM_053039:exon5:c.T1172C:p.V391A score=0.949699;Name=chr4:70035680 NA 0.000199681 NA + GTA chr4 70156391 70156391 T C

Will produce::

    Chr Start End Ref Alt Func.refGene Gene.refGene ExonicFunc.refGene AAChange.refGene genomicSuperDups snp138 1000g2014oct_all esp6500si_all Strand context Chromosome Start_Position End_Position Reference_Allele Tumor_Seq_Allele2
    chr1 230846235 230846235 T A exonic AGT nonsynonymous SNV AGT:NM_000029:exon2:c.A362T:p.H121L NA NA NA NA - GTG chr1 230846235 230846235 T A
    chr14 33290999 33290999 A G exonic AKAP6 nonsynonymous SNV AKAP6:NM_004274:exon13:c.A3980G:p.D1327G NA NA NA NA + GAC chr14 33290999 33290999 A G
    chr4 70156391 70156391 T C exonic UGT2B28 nonsynonymous SNV UGT2B28:NM_053039:exon5:c.T1172C:p.V391A score=0.949699;Name=chr4:70035680 NA 0.000199681 NA + GTA chr4 70156391 70156391 T C

</help>

<citations>
    <citation type="bibtex">
        @article{ardin_mutspec:_2016,
            title = {{MutSpec}: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes},
            volume = {17},
            issn = {1471-2105},
            doi = {10.1186/s12859-016-1011-z},
            shorttitle = {{MutSpec}},
            abstract = {{BACKGROUND}: The nature of somatic mutations observed in human tumors at single gene or genome-wide levels can reveal information on past carcinogenic exposures and mutational processes contributing to tumor development. While large amounts of sequencing data are being generated, the associated analysis and interpretation of mutation patterns that may reveal clues about the natural history of cancer present complex and challenging tasks that require advanced bioinformatics skills. To make such analyses accessible to a wider community of researchers with no programming expertise, we have developed within the web-based user-friendly platform Galaxy a first-of-its-kind package called {MutSpec}. {RESULTS}: {MutSpec} includes a set of tools that perform variant annotation and use advanced statistics for the identification of mutation signatures present in cancer genomes and for comparing the obtained signatures with those published in the {COSMIC} database and other sources. {MutSpec} offers an accessible framework for building reproducible analysis pipelines, integrating existing methods and scripts developed in-house with publicly available R packages. {MutSpec} may be used to analyse data from whole-exome, whole-genome or targeted sequencing experiments performed on human or mouse genomes. Results are provided in various formats including rich graphical outputs. An example is presented to illustrate the package functionalities, the straightforward workflow analysis and the richness of the statistics and publication-grade graphics produced by the tool. {CONCLUSIONS}: {MutSpec} offers an easy-to-use graphical interface embedded in the popular Galaxy platform that can be used by researchers with limited programming or bioinformatics expertise to analyse mutation signatures present in cancer genomes. {MutSpec} can thus effectively assist in the discovery of complex mutational processes resulting from exogenous and endogenous carcinogenic insults.},
            pages = {170},
            number = {1},
            journaltitle = {{BMC} Bioinformatics},
            author = {Ardin, Maude and Cahais, Vincent and Castells, Xavier and Bouaoun, Liacine and Byrnes, Graham and Herceg, Zdenko and Zavadil, Jiri and Olivier, Magali},
            date = {2016},
            pmid = {27091472},
            keywords = {Galaxy, Mutation signatures, Mutation spectra, Single base substitutions}
        }
    </citation>
</citations>

</tool>