view cmstat.xml @ 1:55bb96edfc07 draft

author bgruening
date Thu, 24 Apr 2014 15:02:05 -0400
parents 652f9d550531
children fac157e22e1b
line wrap: on
line source

<tool id="infernal_cmstat" name="Summary statistics" version="">
    <description>for covariance model (cmstat)</description>
        <requirement type="package">infernal</requirement>
        <requirement type="package" version="1.1">infernal</requirement>
        <requirement type="package" version="8.21">gnu_coreutils</requirement>
        ## a temp file is needed, because the standard tabular output from infernal is not usefull in Galaxy
        ## it will be converted to a tab delimited file and piped to Galaxy


            #if str($cm_opts.cm_opts_selector) == "db":
            #end if

            > \$temp_tabular_output

            ## 1. replace all lines starting # (comment lines)
            ## 2. replace the first 18 spaces with tabs, 18th field is a free text field (can contain spaces)
            sed -e 's/#.*$//' -e '/^$/d' -e 's/ /\t/g' -e 's/\t/ /18g' \$temp_tabular_output > $outfile

            <conditional name="cm_opts">
                <param name="cm_opts_selector" type="select" label="Subject covariance models">
                  <option value="db" selected="True">Locally installed covariance models</option>
                  <option value="histdb">Covariance model from your history</option>
                <when value="db">
                    <param name="database" type="select" label="Covariance models">
                        <options from_file="infernal.loc">
                          <column name="value" index="0"/>
                          <column name="name" index="1"/>
                          <column name="path" index="2"/>
                <when value="histdb">
                    <param name="cmfile" type="data" format="cm" label="Covariance models file from the history."/>
        <data format="tabular" name="outfile" label="cmstat on ${on_string}"/>

**What it does**

The cmstat utility prints out a tabular file of summary statistics for each given covariance model.

Output format

By default, cmstat prints general statistics of the model and the alignment it was built from, one line per model in a
tabular format. 

The columns are:

(1) The index of this profile, numbering each on in the file starting from 1.
(2) The name of the profile.
(3) The optional accession of the profile, or ”-” if there is none.
(4) The number of sequences that the profile was estimated from.
(5) The effective number of sequences that the profile was estimated from, after Infernal applied an effective sequence number calculation such as the default entropy weighting.
(6) The length of the model in consensus residues (match states).
(7) The expected maximum length of a hit to the model.
(8) The number of basepairs in the model.
(9) The number of bifurcations in the model.
(10) What type of model will be used by default in cmsearch and cmscan for this profile, either ”cm” or ”hmm”. For profiles with 0 basepairs, this will be ”hmm” (unless the --nohmmonly option is used). For all other profiles, this will be ”cm”.
(11) Mean relative entropy per match state, in bits. This is the expected (mean) score per con-
     sensus position. This is what the default entropy-weighting method for effective sequence
     number estimation focuses on, so for default Infernal, this value will often reflect the default
     target for entropy-weighting. If the ”model” field for this profile is ”hmm”, this field will be ”-”.
(12) Mean relative entropy per match state, in bits, if the CM were transformed into an HMM (information from structure is ignored). The larger the difference between the CM and HMM
     relative entropy, the more the model will rely on structural conservation relative sequence conservation when identifying homologs.

For further questions please refere to the Infernal Userguide_.

.. _Userguide:

How do I cite Infernal?

The recommended citation for using Infernal 1.1 is E. P. Nawrocki and S. R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches , Bioinformatics 29:2933-2935 (2013). 

**Galaxy Wrapper Author**::

    *  Bjoern Gruening, University of Freiburg