Galaxy | Tool Preview

Beagle (version 5.2_21Apr21.304+galaxy1)
Specifies a VCF file containing genotypes for the study samples. Each VCF record must contain a GT (genotype) format field
Optional input files
Input format: [chrom]:[start]-[end]. The entire chromosome, the beginning of the chromosome, or the end of the chromosome may be specified by chrom=[chrom], chrom=[chrom]:-[end], and chrom=[chrom]:[start]-, respectively
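
The interval format above can be checked before a job is submitted. The following is a minimal sketch, assuming Python; parse_interval is a hypothetical helper written for illustration, not part of Beagle or Galaxy, and it only mirrors the [chrom]:[start]-[end] format as described here.

```python
import re

# Hypothetical helper: mirrors the [chrom]:[start]-[end] format described above.
# The chromosome name is anything without ":" or whitespace; start and end are optional.
_INTERVAL = re.compile(r"^(?P<chrom>[^:\s]+)(?::(?P<start>\d*)-(?P<end>\d*))?$")


def parse_interval(text):
    """Return (chrom, start, end); start or end is None when it is omitted."""
    match = _INTERVAL.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid chromosome interval: {text!r}")
    start = int(match["start"]) if match["start"] else None
    end = int(match["end"]) if match["end"] else None
    if start is not None and end is not None and start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return match["chrom"], start, end
```

An omitted bound is returned as None, so a caller can tell "from the beginning" or "to the end" apart from an explicit position.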
The default value is suitable for a large, outbred population. You must specify an appropriate effective population size if you are imputing ungenotyped markers in a small or inbred population
The window parameter must be at least 1.1 times as large as the overlap parameter. The window parameter controls the amount of memory required for the analysis
Specifies the cM length of overlap between adjacent sliding windows
If no err parameter is specified, the err parameter will be set equal to 𝜃/(2(𝜃 + 𝐻)) where 𝜃 = 1/(0.5 + ln 𝐻) and 𝐻 is the number of haplotypes
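
To get a feel for the size of the default err value, the formula above can be evaluated directly. This is a minimal sketch, assuming Python; default_err is a hypothetical function name, and the code only transcribes the formula as stated here rather than reproducing Beagle's internal implementation.

```python
import math


def default_err(n_haplotypes):
    """Evaluate theta / (2 * (theta + H)) with theta = 1 / (0.5 + ln H)."""
    if n_haplotypes < 1:
        raise ValueError("the number of haplotypes must be at least 1")
    theta = 1.0 / (0.5 + math.log(n_haplotypes))
    return theta / (2.0 * (theta + n_haplotypes))


if __name__ == "__main__":
    # Print the default for a few illustrative haplotype counts.
    for h in (100, 1000, 10000):
        print(h, default_err(h))
```

The value shrinks as the number of haplotypes grows, since 𝐻 appears in the denominator and 𝜃 itself decreases with ln 𝐻.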
A random seed is a number used to initialize a pseudorandom number generator
Phasing parameters
Imputation parameters

Purpose

Beagle is a program for phasing and imputing missing genotypes. Sporadic missing genotypes are imputed during phasing. If a reference panel of phased genotypes is specified with the ref argument, ungenotyped markers that are present in the reference panel can also be imputed.

Beagle version 5.2 provides significantly faster genotype phasing than version 5.1. Recent versions of Beagle do not infer genotypes from genotype likelihood input data, but Beagle versions 4.0 and 4.1 have this capability.


HapMap genetic maps

HapMap genetic maps in PLINK format for GRCh36, GRCh37, and GRCh38 are available at this link.


Input files

Beagle uses Variant Call Format (VCF) 4.3 for input and output genotype data. Pseudoautosomal and non-pseudoautosomal X-chromosome genotypes must be in separate input files and analysed separately unless male haploid genotypes are coded as homozygous diploid genotypes.

In the VCF file, if any heterozygote genotype in a marker window is unphased (with the "/" allele separator), Beagle will consider all heterozygote genotypes in that window to be unphased, regardless of the allele separator used ("|" or "/"). Beagle assumes that a VCF file with a name ending in ".gz" is compressed with gzip or bgzip, and that a reference file with a name ending in ".bref3" is compressed with bref version 3.


Output files

There are two output files. The log file gives a summary of the analysis that includes the Beagle version, the command line arguments, and compute time.

The vcf.gz file is a bgzip-compressed VCF file that contains phased, non-missing genotypes for all non-reference samples. The output vcf.gz file can be uncompressed with the Unix gunzip utility.
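
The output can also be read without uncompressing it first, because bgzip output is compatible with gzip. The following is a minimal sketch, assuming Python and its standard gzip module; iter_vcf_records is a hypothetical helper, and the path passed to it is whatever name your output file was given.

```python
import gzip


def iter_vcf_records(path):
    """Yield each data line of a gzip- or bgzip-compressed VCF as a list of columns."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines ("##...") and the column header ("#CHROM...").
            if line.startswith("#"):
                continue
            yield line.rstrip("\n").split("\t")
```

Each yielded list holds the tab-separated VCF columns in file order, so the INFO column is at index 7 and per-sample genotype columns start at index 9.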

If a reference panel is specified and ungenotyped markers are imputed, the VCF INFO field will contain:

- A "DR2" subfield with the estimated squared correlation between the estimated allele dose and the true allele dose.
- An "AF" subfield with the estimated alternate allele frequencies in the target samples.
- The "IMP" flag if the marker is imputed.
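
The three INFO entries listed above can be pulled out of a record's INFO column with a few lines of code. This is a minimal sketch, assuming Python; parse_info and imputation_summary are hypothetical helper names, and the example string in the usage note is made up for illustration rather than taken from real Beagle output.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flags without a value map to True."""
    fields = {}
    if info in ("", "."):
        return fields
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def imputation_summary(info):
    """Extract DR2, AF and the IMP flag from an INFO string.

    DR2 and AF are returned as lists of floats (one value per alternate
    allele), or None when the subfield is absent.
    """
    fields = parse_info(info)

    def as_floats(key):
        if key not in fields:
            return None
        return [float(x) for x in fields[key].split(",")]

    return {
        "dr2": as_floats("DR2"),
        "af": as_floats("AF"),
        "imputed": "IMP" in fields,
    }
```

For example, imputation_summary("DR2=0.87;AF=0.12;IMP") reports a DR2 of 0.87, an alternate allele frequency of 0.12, and imputed set to True. A downstream filter could then keep only markers whose DR2 exceeds a threshold of your choosing.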