view crossSampleOverview.xml @ 2:a64ece32a01a draft default tip

"planemo upload for repository https://github.com/ImmPortDB/immport-galaxy-tools/tree/master/flowtools/cs_overview commit a46097db0b6056e1125237393eb6974cfd51eb41"
author azomics
date Tue, 28 Jul 2020 08:32:36 -0400
parents bca68066a957
children
line wrap: on
line source

<tool id="cross_sample_overview" name="Generate overview information" version="1.1+galaxy1">
  <description>of a multi-sample analysis.</description>
  <requirements>
    <requirement type="package" version="1.0.5">pandas</requirement>
    <requirement type="package" version="2.11.2">jinja2</requirement>
  </requirements>
  <stdio>
    <exit_code range="1" level="fatal" description="There are too many populations in the input files. The maximum number of populations is 40." />
    <exit_code range="2" level="fatal" description="There are populations inconsistencies between provided inputs." />
    <exit_code range="3:"/>
  </stdio>
  <command><![CDATA[
    python '$__tool_directory__/crossSampleOverview.py' -i '${input}' -I '${inputmfi}' -o '${html_file}' -m '${mfi}' -d '${html_file.files_path}' -t '$__tool_directory__'
    #for $f in $cs_outputs
       -s $f
    #end for
    ;
    cp -r '$__tool_directory__'/js '${html_file.files_path}/';
    cp -r '$__tool_directory__'/css '${html_file.files_path}/';
    cp -r '$__tool_directory__'/images '${html_file.files_path}/';
  ]]>
  </command>
  <inputs>
    <param format="flowstat1" name="input" type="data" label="CrossSample Population Summary Table"/>
    <param format="flowstat2" name="inputmfi" type="data" label="Centroids MFI Summary Table"/>
    <param format="flowmfi" name="mfi" type="data" label="Centroids File from FLOCK"/>
    <param format="flowclr" name="cs_outputs" type="data_collection" collection_type="list" label="Clustered flow files" help="output from a Cross Sample analysis or from mapping individual files to a FlowSOM reference tree."/>
  </inputs>
  <outputs>
    <data format="html" name="html_file" label="Overview of ${input.name}">
    </data>
  </outputs>
  <tests>
    <test>
      <param name="input" value="input.flowstat1"/>
      <param name="inputmfi" value="input.flowstat2"/>
      <param name="mfi" value="input.mfi"/>
      <param name="cs_outputs">
        <collection type="list">
          <element name="input1.flowclr" value="input1.flowclr"/>
          <element name="input2.flowclr" value="input2.flowclr"/>
          <element name="input3.flowclr" value="input3.flowclr"/>
        </collection>
      </param>
      <output name="html_file" file="out.html" compare="sim_size">
        <extra_files type="file" name="csAllMFIs.tsv" value="csAllMFIs.tsv"/>
        <extra_files type="file" name="csOverview.tsv" value="csOverview.tsv"/>
        <extra_files type="file" name="csBoxplotData.json" value="csBoxplotData.json" compare="sim_size"/>
      </output>
    </test>
  </tests>
  <help><![CDATA[
   This tool generates an overview of the flow analysis results.

-----

**Input**

Input files are tab-separated files containing counts of events in each population for each file compared to reference population patterns, MFI for each marker in each population for each file, and individual files used in the comparison. The output from a Cross Sample analysis as well as from mapping individual files to a FlowSOM reference tree are suitable files.
The Centroids file can be generated with the following Flow Analysis tool:

- Generate the centroids from a flow result file

.. class:: warningmark

The number of populations or clusters this tool can handle is limited to 40.

**Output**

The output is a page that allows visualization of the data.

.. class:: warningmark

*The output of this tool is interactive. However, comments or any other modifications made are not saved when exiting the view.*

-----

**Example**

*Input*

Clustered Flow Data - fluorescence intensities per marker and population ID per event::

   Marker1 Marker2 Marker3 ... Population
   33      47      11      ... 1
   31      64      11      ... 6
   21      62      99      ... 2
   14      34      60      ... 7
   ...     ...     ...     ... ...


Population distribution table::

   Filename SampleName Pop1 Pop2 Pop3 ...
   File1    Biosample1 0.1  0.25 0.14 ...
   File2    Biosample2 0.02 0.1  0.17 ...
   File3    Biosample3 0.4  0.05 0.21 ...
   File4    Biosample4 0.05 0.3  0.08 ...
   ...      ...        ...  ...  ...  ...

Centroids MFI Summary table::

   Marker1 Marker2 Marker3 ... Population Percentage SampleName
   154     885     24      ... 1          0.2        Biosample1
   458     74      574     ... 2          0.3        Biosample1
   3       210     86      ... 3          0.05       Biosample1
   ...     ...     ...     ... ...        ...        ...
   140     921     19      ... 1          0.1        Biosample2
   428     79      508     ... 2          0.25       Biosample2
   9       225     90      ... 3          0.3        Biosample2
   ...     ...     ...     ... ...        ...        ...

Centroids file::

   Population Marker1 Marker2 Marker3 ...
   1          38      49      10      ...
   2          21      63      100     ...
   3          31      52      45      ...
   4          11      78      25      ...
   ...        ...     ...     ...     ...

*Output*

Summary of the population distribution:

The comment field of this table is editable, as well as the population names. The edited values are used to populate the legends of the other graphs. The columns are re-orderable. The 'col visibility' button allows to choose which columns to display. 'CSV', 'PDF' and 'Copy' respectively allow to download a comma-separated values file, a pdf version or to copy to your clipboard in a tab-separated format the current view of the table.

.. image:: ./static/images/flowtools/popdistrib.png

.. image:: ./static/images/flowtools/edit_popdistrib.png


Stacked plot of the population distribution in each sample:

The user can choose which populations to display, and whether to see the data as a bar plot or area plot. The graph displays the proportion of each population within the set of selected populations. The Plotly toolbar allows more control over the display of the graph. There is an option to save the plot as a png file.

.. image:: ./static/images/flowtools/stackedA.png

.. image:: ./static/images/flowtools/stackedB.png


Parallel coordinates representation of populations:

The user can reorder the populations, and choose which samples to display either by selecting them in the legend via checkboxes or by selecting them on the graph. Data selected for display is shown in the table below the graph. Mousing over a line in that table highlights the corresponding line on the graph. The 'col visibility' button allows to choose which columns to display. 'CSV', 'PDF' and 'Copy' respectively allow to download a comma-separated values file, a pdf version or to copy to your clipboard in a tab-separated format the current view of the table.

.. image:: ./static/images/flowtools/pcpop.png


Parallel coordinates representation of the data:

The user can reorder the markers, and choose which samples and/or populations to display either by selecting them in the legends via checkboxes or by selecting them on the graph. Data selected for display is shown in the table below the graph. Mousing over a line in that table highlights the corresponding line on the graph. The 'col visibility' button allows to choose which columns to display. 'CSV', 'PDF' and 'Copy' respectively allow to download a comma-separated values file, a pdf version or to copy to your clipboard in a tab-separated format the current view of the table.

.. image:: ./static/images/flowtools/pcmfi.png


Summary Statistics Boxplots:

The user can choose whether to group the boxplots per marker or per population. By default, the values displayed are the 25th, median and 75th percentiles. The whiskers represent 1.5 times the interquartile range. The MFI or the values can be displayed by checking the corresponding checkboxes. The number of markers that can be plotted simultaneously is limited to 5. The number of outliers per data point is limited to 100. If there are more than 100 outliers, they are downsampled randomly to a 100 and a warning message is displayed.

.. image:: ./static/images/flowtools/cs_bpt.png
  ]]>
  </help>
</tool>