What it does
This tool outputs a tab-delimited file of chromosome names and their lengths, as they appear in the selected genome/SnpEff database. The output can be used to validate and rename the chromosomes in a VCF file so that its records can be annotated.
Note: make sure that the genome you select from the SnpEff database precisely matches the one used in your analysis. As a cursory check, you can compare the chromosome lengths in this output with those in your reference. Matching lengths are not proof of a match, however: the genome version may still differ.
Known issue: matching chromosomes by length will not work if more than one chromosome in the same genome has the same length.
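The length comparison can be sketched in Python as follows. This is a minimal illustration, not part of the tool: it assumes both inputs are tab-delimited text with the name in the first column and the length in the second (the layout of a samtools `.fai` index; the column order of this tool's output is an assumption here). The function names `parse_lengths` and `map_by_length` are hypothetical. Names whose length is shared by more than one chromosome, or has no counterpart, are reported as unresolved rather than guessed.

```python
from collections import Counter


def parse_lengths(text):
    """Parse 'name<TAB>length' lines into a dict of name -> length."""
    lengths = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


def map_by_length(ref_lengths, snpeff_lengths):
    """Map reference names to SnpEff names where a length matches uniquely.

    Returns (mapping, unresolved). A name is unresolved when its length
    is missing from the SnpEff list or is shared by several chromosomes
    on either side.
    """
    ref_counts = Counter(ref_lengths.values())
    db_counts = Counter(snpeff_lengths.values())
    by_length = {length: name for name, length in snpeff_lengths.items()}
    mapping, unresolved = {}, []
    for name, length in ref_lengths.items():
        if ref_counts[length] == 1 and db_counts.get(length, 0) == 1:
            mapping[name] = by_length[length]
        else:
            unresolved.append(name)
    return mapping, unresolved
```

Any unresolved names need to be matched by hand, and a complete mapping still only shows that the lengths agree, not that the genome versions are identical.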
The usage scenario
Suppose you want to use SnpEff to annotate a VCF file that was generated using a mouse reference whose chromosome naming convention differs from the one in the SnpEff database. To do this you can:
- Use SnpEff databases to find the precise genome name for mouse data (e.g. "mm10") as it appears in the snpEff database.
- List the chromosome names using this tool. Select a built-in genome or one in your history, or select "Download on demand" and enter the genome version obtained in the previous step (the genome is downloaded only if SnpEff doesn't already have it).
- Check that the chromosomes in the SnpEff database are the same as the reference you used (e.g. as a cursory check, ensure the chromosome lengths reported from the SnpEff database match those of your reference).
- Edit your VCF file to replace the chromosome names with the ones the SnpEff database uses.
- Use SnpEff eff and supply the edited VCF file.
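The renaming step can be sketched in Python as below. This is an illustration under stated assumptions, not the method this tool uses: the function name `rename_chromosomes` and the example mapping are hypothetical, the VCF is assumed to be uncompressed text, and `ID` is assumed to be the first key in each `##contig` header line. Names missing from the mapping are left unchanged.

```python
import re

# Matches the ID at the start of a contig header, e.g. "##contig=<ID=chr1,..."
CONTIG_ID = re.compile(r"^(##contig=<ID=)([^,>]+)")


def rename_chromosomes(lines, mapping):
    """Yield VCF lines with chromosome names replaced according to mapping."""
    for line in lines:
        if line.startswith("##contig="):
            # Rename the contig declared in the header.
            yield CONTIG_ID.sub(
                lambda m: m.group(1) + mapping.get(m.group(2), m.group(2)),
                line,
            )
        elif line.startswith("#"):
            # Other header lines pass through untouched.
            yield line
        else:
            # Data line: CHROM is the first tab-separated column.
            chrom, sep, rest = line.partition("\t")
            yield mapping.get(chrom, chrom) + sep + rest
```

A possible usage, with hypothetical file names and a hypothetical mapping:

```python
mapping = {"chr1": "1", "chr2": "2", "chrM": "MT"}  # example values only
with open("input.vcf") as src, open("renamed.vcf", "w") as dst:
    dst.writelines(rename_chromosomes(src, mapping))
```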