cmstat (version

What it does

The cmstat utility prints a table of summary statistics for each covariance model it is given.

Output format

By default, cmstat prints general statistics of the model and the alignment it was built from, one line per model in a tabular format.

The columns are:

  1. The index of this profile, numbering each one in the file starting from 1.
  2. The name of the profile.
  3. The optional accession of the profile, or "-" if there is none.
  4. The number of sequences that the profile was estimated from.
  5. The effective number of sequences that the profile was estimated from, after Infernal applied an effective sequence number calculation such as the default entropy weighting.
  6. The length of the model in consensus residues (match states).
  7. The expected maximum length of a hit to the model.
  8. The number of basepairs in the model.
  9. The number of bifurcations in the model.
  10. What type of model will be used by default in cmsearch and cmscan for this profile, either "cm" or "hmm". For profiles with 0 basepairs, this will be "hmm" (unless the --nohmmonly option is used). For all other profiles, this will be "cm".
  11. Mean relative entropy per match state, in bits. This is the expected (mean) score per consensus position. This is what the default entropy-weighting method for effective sequence number estimation focuses on, so for default Infernal, this value will often reflect the default target for entropy-weighting. If the "model" field for this profile is "hmm", this field will be "-".
  12. Mean relative entropy per match state, in bits, if the CM were transformed into an HMM (information from structure is ignored). The larger the difference between the CM and HMM relative entropy, the more the model will rely on structural conservation relative to sequence conservation when identifying homologs.
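
The twelve columns above can be read into records with a few lines of Python. This is a minimal sketch, not part of cmstat or Galaxy. It assumes the output is whitespace-separated and that header lines start with "#". The field names, the helper functions parse_cmstat and structure_gain, and the sample row are all made up for illustration.

```python
# Hypothetical field names for the 12 columns described above.
FIELDS = (
    "idx", "name", "accession", "nseq", "eff_nseq", "clen",
    "W", "bps", "bifs", "model", "cm_entropy", "hmm_entropy",
)
INT_FIELDS = {"idx", "nseq", "clen", "W", "bps", "bifs"}
FLOAT_FIELDS = {"eff_nseq", "cm_entropy", "hmm_entropy"}


def parse_cmstat(text):
    """Parse cmstat-style tabular text into a list of dicts.

    Assumption: columns are whitespace-separated and lines that
    start with '#' are headers or comments.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        values = line.split()
        if len(values) != len(FIELDS):
            raise ValueError(
                f"expected {len(FIELDS)} columns, got {len(values)}: {line!r}"
            )
        record = {}
        for field, value in zip(FIELDS, values):
            if value == "-":
                # "-" marks a missing value (no accession, or no CM entropy).
                record[field] = None
            elif field in INT_FIELDS:
                record[field] = int(value)
            elif field in FLOAT_FIELDS:
                record[field] = float(value)
            else:
                record[field] = value
        records.append(record)
    return records


def structure_gain(record):
    """Column 11 minus column 12, in bits, or None if either is missing."""
    cm, hmm = record["cm_entropy"], record["hmm_entropy"]
    if cm is None or hmm is None:
        return None
    return cm - hmm


# Illustrative row only; the values are not real cmstat output.
SAMPLE = """\
# idx  name   accession  nseq  eff_nseq  clen  W   bps  bifs  model  cm     hmm
1      tRNA5  -          5     3.73      72    86  21   2     cm     0.783  0.489
"""

if __name__ == "__main__":
    for rec in parse_cmstat(SAMPLE):
        print(rec["name"], rec["model"], structure_gain(rec))
```

A larger value from structure_gain corresponds to the case described for column 12: the model relies more on structural conservation than on sequence conservation.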

For further questions, please refer to the Infernal User's Guide.