Galaxy | Tool Preview

Frameshift Deletions Checks (version 0.5.2+galaxy0)
Fasta file containing the sample's consensus sequence (majority, with indels)
Input BAM file with sample's sequencing reads, aligned against the reference
To find insertions and deletions, the tool needs to know how the consensus aligns to the reference (lift-over). You can provide a .chain file describing how the consensus maps to the reference; otherwise, mafft will be used to align the consensus to the reference.
Chain file describing how the consensus is aligned to the reference (e.g. output of `bcftools consensus --chain …`).
Select built-in genome files to base reported positions and annotations on the SARS-CoV-2 reference sequence NC_045512.2. If you have mapped to a different reference, select custom genome files and provide the reference sequence and genomic feature annotations for it in fasta and gff format, respectively.
Output format options

Produces a report about frameshifting indels in a consensus sequence.

The smallgenomeutilities are part of the V-pipe workflow for analysing NGS data of short viral genomes.

Column descriptions:

  • ref_id / cons_id: name of the sequence in the reference and consensus
  • start_position / length: location of the variant
  • VARIANT: one of: "insertion", "deletion", "stopgain" or "stoploss"
  • gene_region: Gene in which the deletion is found according to --genes argument;
  • reads_all: Total number of reads covering the indel;
  • reads_fwd: Total number of forward reads covering the indel;
  • reads_rev: Total number of reverse reads covering the indel;
  • deletions / insertions: Number of reads supporting the deletion/insertion;
  • freq_del / freq_insert: Fraction of reads supporting the deletion/insertion;
  • matches_ref: number of reads that match the reference base;
  • pos_critical_inserts: Start positions of insertions in the same gene_region that occur in > 40% of reads;
  • pos_critical_dels: Start positions of deletions in the same gene_region that occur in > 40% of reads;
  • homopolymeric: True if either around the start or end position of the deletion three bases are the same, which may have caused the polymerase to skip during reverse transcription of viral RNA to cDNA, e.g. AATAG;
  • ref_base: base in the reference genome;
  • variant_position_english: English sentence describing the indel or stop;
  • variant_diagnosis: English sentence with the indel diagnosis
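
The frequency, "critical" and homopolymeric columns described above can be illustrated with a small Python sketch. This is not the tool's own code: the function names, the 0-based coordinates and the two-base flank used to look "around" a position are assumptions made for illustration. Only the 40% threshold and the three-identical-bases rule come from the column descriptions.

```python
def indel_frequency(supporting_reads: int, reads_all: int) -> float:
    """Fraction of covering reads that support the indel (0.0 without coverage)."""
    if reads_all <= 0:
        return 0.0
    return supporting_reads / reads_all


def is_critical(supporting_reads: int, reads_all: int, threshold: float = 0.4) -> bool:
    """True if the indel occurs in more than `threshold` of the covering reads."""
    return indel_frequency(supporting_reads, reads_all) > threshold


def has_base_run(seq: str, pos: int, run: int = 3, flank: int = 2) -> bool:
    """True if `run` identical bases occur within `flank` bases of 0-based `pos`.

    The flank width is an assumption; the help text only says "around".
    """
    lo = max(0, pos - flank)
    hi = min(len(seq), pos + flank + 1)
    window = seq[lo:hi].upper()
    return any(
        len(set(window[i:i + run])) == 1
        for i in range(len(window) - run + 1)
    )


def is_homopolymeric(ref: str, start: int, length: int) -> bool:
    """Check both the start and the end position of a deletion (0-based start)."""
    return has_base_run(ref, start) or has_base_run(ref, start + length - 1)


if __name__ == "__main__":
    # 45 of 100 covering reads support the deletion.
    print(indel_frequency(45, 100))   # 0.45
    print(is_critical(45, 100))       # True
    print(is_critical(40, 100))       # False: the threshold is strict (> 40%)
```

The comparison is deliberately strict (`>`), mirroring the "> 40% of reads" wording for pos_critical_inserts and pos_critical_dels.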