comparison mutspecAnnot.xml @ 7:eda59b985b1c draft default tip
Uploaded
author | iarc |
---|---|
date | Mon, 13 Mar 2017 08:21:19 -0400 |
parents | 46a10309dfe2 |
children |
6:46a10309dfe2 | 7:eda59b985b1c |
---|---|
32 <outputs> | 32 <outputs> |
33 <data name="output" type="data" format="tabular" label="${input.name} annotated" /> | 33 <data name="output" type="data" format="tabular" label="${input.name} annotated" /> |
34 </outputs> | 34 </outputs> |
35 | 35 |
36 <stdio> | 36 <stdio> |
37 <regex match="ANNOVAR LOG FILE" | 37 <regex match="Missing flag !" source="stderr" level="fatal" description="You have forgotten to specify one or more arguments" /> |
38 source="stdout" | 38 <regex match="Error message:" source="stderr" level="fatal" description="Read error message for more details" /> |
39 level="fatal" | 39 <regex match="ANNOVAR LOG FILE" source="stderr" level="fatal" description="Read Annovar log file for more information" /> |
40 description="Read Annovar log file for more information" /> | |
41 </stdio> | 40 </stdio> |
42 | 41 |
43 <help> | 42 <help> |
44 | 43 |
45 **What it does** | 44 **What it does** |
46 | 45 |
47 MutSpect-Annot provides functional annotations from `ANNOVAR software`__ (Feb 2016 version is provided here), as well as the strand transcript orientation (from refGene database) and sequence context of variants (extrated from the reference genome selected). | 46 MutSpect-Annot provides functional annotations from `ANNOVAR software`__ (Feb 2016 version is provided here), as well as the strand transcript orientation (from refGene database) and sequence context of variants (extrated from the reference genome selected). |
48 | 47 |
49 .. __: http://www.openbioinformatics.org/annovar/ | 48 .. __: http://www.openbioinformatics.org/annovar/ |
50 | 49 |
| | 50 .. class:: infomark |
| | 51 |
| | 52 MutSpect-Annot works for human, mouse and rat genomes. |
| | 53 |
51 -------------------------------------------------------------------------------------------------------------------------------------------------- | 54 -------------------------------------------------------------------------------------------------------------------------------------------------- |
52 | 55 |
53 **Input formats** | 56 **Input formats** |
54 | 57 |
55 MutSpect-Annot accepts files in VCF (version 4.1 and 4.2) or in tab-delimited (TAB) format. | 58 MutSpect-Annot accepts files in VCF (version 4.1 and 4.2) or in tab-delimited (TAB) format. |
62 | 65 |
63 Filenames must be <= 31 characters. | 66 Filenames must be <= 31 characters. |
64 | 67 |
65 .. class:: warningmark | 68 .. class:: warningmark |
66 | 69 |
67 These files should contain at least four columns describing for each variant, the chromosome number, the start genomic position, the reference allele and the alternate allele | 70 Files should contain at least four columns describing for each variant: the chromosome number, the start genomic position, the reference allele and the alternate alleles. These columns can be in any order. |
| | 71 |
| | 72 .. class:: warningmark |
| | 73 |
| | 74 If multiple input files are specified they should be from the **same genome build** and in the **same format**. |
68 | 75 |
69 .. class:: warningmark | 76 .. class:: warningmark |
70 | 77 |
71 The tool supports different column names (**names are case-sensitive**) depending on the source file as follows: | 78 The tool supports different column names (**names are case-sensitive**) depending on the source file as follows: |
72 | 79 |
73 **mutect** : contig position ref_allele alt_allele | 80 **mutect** : contig position ref_allele alt_allele |
74 | 81 |
| | 82 **vcf** : version `4.1`__ and `4.2`__ |
| | 83 |
| | 84 .. __: https://samtools.github.io/hts-specs/VCFv4.1.pdf |
| | 85 .. __: https://samtools.github.io/hts-specs/VCFv4.2.pdf |
| | 86 |
75 **cosmic** : Mutation_GRCh37_chromosome_number Mutation_GRCh37_genome_position Description_Ref_Genomic Description_Alt_Genomic | 87 **cosmic** : Mutation_GRCh37_chromosome_number Mutation_GRCh37_genome_position Description_Ref_Genomic Description_Alt_Genomic |
76 | 88 |
77 **icgc** : chromosome chromosome_start reference_genome_allele mutated_to_allele | 89 **icgc** : chromosome chromosome_start reference_genome_allele mutated_to_allele |
78 | 90 |
79 **tcga** : Chromosome Start_position Reference_Allele Tumor_Seq_Allele2 | 91 **tcga** : Chromosome Start_position Reference_Allele Tumor_Seq_Allele2 |
82 | 94 |
83 **proton** : Chrom Position Ref Variant | 95 **proton** : Chrom Position Ref Variant |
84 | 96 |
85 **varScan2** : Chrom Position Ref VarAllele | 97 **varScan2** : Chrom Position Ref VarAllele |
86 | 98 |
| | 99 **varScan2 somatic** : chrom position ref var |
| | 100 |
87 **annovar** : Chr Start Ref Obs | 101 **annovar** : Chr Start Ref Obs |
88 | 102 |
89 **custom** : Chromosome Start Wild_Type Mutant | 103 **custom** : Chromosome Start Wild_Type Mutant |
90 | 104 |
91 .. class:: infomark | 105 .. class:: infomark |
92 | 106 |
93 For MuTect and MuTect2 output files, only confident calls are considered (variants containing the string REJECT in the judgement column or not passing MuTect2 filters, are not annotated and excluded from the MutSpect-Annot output) as other calls are very likely to be dubious calls or artefacts. | 107 For MuTect and MuTect2 output files, only confident calls are considered as other calls are very likely to be dubious calls or artefacts. |
94 | 108 Variants containing the string REJECT in the judgement column or not passing MuTect2 filters are not annotated and excluded from the MutSpect-Annot output. |
95 .. class:: infomark | 109 |
96 | 110 .. class:: infomark |
97 For COSMIC and ICGC files, variants are reported on several transcripts. These duplicate variants need to be remove before annotated the file. | 111 |
98 | 112 For COSMIC and ICGC files, variants are reported on several transcripts. These duplicate variants need to be removed before annotating the file. |
99 .. class:: warningmark | |
100 | |
101 If multiple input files are specified they should be from the **same genome build** and in the **same format**. | |
102 | 113 |
103 | 114 |
104 -------------------------------------------------------------------------------------------------------------------------------------------------- | 115 -------------------------------------------------------------------------------------------------------------------------------------------------- |
105 | 116 |
106 **Output** | 117 **Output** |
107 | 118 |
108 The output is a tabular text file, that contains the retrieved annotations in the first columns and all columns from the original file at the end. | 119 The output is a tabular text file, that contains the retrieved annotations in the first columns and all columns from the original file at the end. |
109 | 120 |
110 .. class:: infomark | 121 .. class:: infomark |
111 | 122 |
112 Variants on chromosome M and random chromosomes are not considered for the annotation and excluded from MutSpec-Annot output. | 123 Only classic chromosomes are considered for the annotation, all other chromosomes are excluded from MutSpec-Annot output. |
| | 124 For example for human genome only chr1 to chrY are annotated. |
113 | 125 |
114 The following annotations are retrieved: | 126 The following annotations are retrieved: |
115 | 127 |
116 **ANNOVAR annotations** | 128 **ANNOVAR annotations** |
117 | 129 |
118 An example of annotations retrieved by the tool: | 130 An example of annotations retrieved by the tool (for the full list please visit the Galaxy pages `Annovar databases`__) |
| | 131 |
| | 132 .. __: http://galaxy.iarc.fr/galaxy/u/ardinm/p/annovar-databases |
119 | 133 |
120 Gene-based: RefSeqGene, UCSC Known Gene and Ensembl Gene | 134 Gene-based: RefSeqGene, UCSC Known Gene and Ensembl Gene |
121 | 135 |
122 Region-based: localization of the variant on cytogenetic band (cytoBand), variant reported in Genome-Wide association studies (gwasCatalog) and variant mapped to segmental duplications (genomicSuperDups) | 136 Region-based: localization of the variant on cytogenetic band (cytoBand), variant reported in Genome-Wide association studies (gwasCatalog) and variant mapped to segmental duplications (genomicSuperDups) |
123 | 137 |
160 chr7 121717919 121717920 - G exonic AASS frameshift insertion AASS:NM_005763:exon23:c.2634dupC:p.A879fs NA rs147476318 NA NA - GCG chr7 121717919 121717920 - G | 174 chr7 121717919 121717920 - G exonic AASS frameshift insertion AASS:NM_005763:exon23:c.2634dupC:p.A879fs NA rs147476318 NA NA - GCG chr7 121717919 121717920 - G |
161 chr1 230846235 230846235 T A exonic AGT nonsynonymous SNV AGT:NM_000029:exon2:c.A362T:p.H121L NA NA NA NA - GTG chr1 230846235 230846235 T A | 175 chr1 230846235 230846235 T A exonic AGT nonsynonymous SNV AGT:NM_000029:exon2:c.A362T:p.H121L NA NA NA NA - GTG chr1 230846235 230846235 T A |
162 chr14 33290999 33290999 A G exonic AKAP6 nonsynonymous SNV AKAP6:NM_004274:exon13:c.A3980G:p.D1327G NA NA NA NA + GAC chr14 33290999 33290999 A G | 176 chr14 33290999 33290999 A G exonic AKAP6 nonsynonymous SNV AKAP6:NM_004274:exon13:c.A3980G:p.D1327G NA NA NA NA + GAC chr14 33290999 33290999 A G |
163 chr12 8082458 8082458 C T exonic SLC2A3 nonsynonymous SNV SLC2A3:NM_006931:exon6:c.G683A:p.R228Q NA rs200481428 0.000199681 NA - CCG chr12 8082458 8082458 C T | 177 chr12 8082458 8082458 C T exonic SLC2A3 nonsynonymous SNV SLC2A3:NM_006931:exon6:c.G683A:p.R228Q NA rs200481428 0.000199681 NA - CCG chr12 8082458 8082458 C T |
164 chr4 70156391 70156391 T C exonic UGT2B28 nonsynonymous SNV UGT2B28:NM_053039:exon5:c.T1172C:p.V391A score=0.949699;Name=chr4:70035680 NA 0.000199681 NA + GTA chr4 70156391 70156391 T C | 178 chr4 70156391 70156391 T C exonic UGT2B28 nonsynonymous SNV UGT2B28:NM_053039:exon5:c.T1172C:p.V391A score=0.949699;Name=chr4:70035680 NA 0.000199681 NA + GTA chr4 70156391 70156391 T C |
| | 179 |
| | 180 |
| | 181 -------------------------------------------------------------------------------------------------------------------------------------------------- |
| | 182 |
| | 183 **Contact** |
| | 184 |
| | 185 ardinm@fellows.iarc.fr; cahaisv@iarc.fr |
| | 186 |
| | 187 -------------------------------------------------------------------------------------------------------------------------------------------------- |
| | 188 |
| | 189 **Code** |
| | 190 |
| | 191 The source code is available on `GitHub`__ |
| | 192 |
| | 193 .. __: https://github.com/IARCbioinfo/mutspec.git |
165 | 194 |
166 | 195 |
167 </help> | 196 </help> |
168 | 197 |
169 | 198 |
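
Reviewer note: the help text in revision 7 says the four required columns are matched by case-sensitive name and may appear in any order. The sketch below illustrates one way such a lookup could work. It is not taken from the MutSpec source: the `COLUMN_NAMES` table and the `detect_source` helper are hypothetical names, and the table only covers the layouts visible in this excerpt of the diff (VCF is omitted because the help text refers to the specification instead of listing column names).

```python
# Hypothetical lookup table, copied from the column names listed in the help text.
# Order within each tuple: chromosome, start position, reference allele, alternate allele.
COLUMN_NAMES = {
    "mutect": ("contig", "position", "ref_allele", "alt_allele"),
    "cosmic": (
        "Mutation_GRCh37_chromosome_number",
        "Mutation_GRCh37_genome_position",
        "Description_Ref_Genomic",
        "Description_Alt_Genomic",
    ),
    "icgc": ("chromosome", "chromosome_start",
             "reference_genome_allele", "mutated_to_allele"),
    "tcga": ("Chromosome", "Start_position",
             "Reference_Allele", "Tumor_Seq_Allele2"),
    "proton": ("Chrom", "Position", "Ref", "Variant"),
    "varScan2": ("Chrom", "Position", "Ref", "VarAllele"),
    "varScan2 somatic": ("chrom", "position", "ref", "var"),
    "annovar": ("Chr", "Start", "Ref", "Obs"),
    "custom": ("Chromosome", "Start", "Wild_Type", "Mutant"),
}


def detect_source(header):
    """Return (source, indices) for the first layout whose four names all
    appear in `header`, or None if no layout matches.

    Matching is case-sensitive and independent of column order, as the
    help text describes. `indices` gives the positions of the chromosome,
    start, reference and alternate columns in that order.
    """
    for source, names in COLUMN_NAMES.items():
        if all(name in header for name in names):
            return source, tuple(header.index(name) for name in names)
    return None


if __name__ == "__main__":
    # Columns deliberately shuffled, with an extra unrelated column.
    print(detect_source(["Ref", "Obs", "Chr", "Start", "Gene"]))
```

A header that differs only in case, such as `chr start ref obs`, matches nothing in this sketch, which is consistent with the case-sensitivity warning in the help text.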