view mutspecNmf.xml @ 1:748b7a8b634c draft

author iarc
date Thu, 21 Apr 2016 09:36:32 -0400
parents 8c682b3a7c5b
children 916846f73e25

<?xml version="1.0"?>
<tool id="mutSpecnmf" name="MutSpec NMF" version="0.0.1">
<description>Extract mutation signatures with the Non-negative Matrix Factorization algorithm</description>

    <requirement type="set_environment">SCRIPT_PATH</requirement>
    <requirement type="package" version="5.18.1">perl</requirement>
    <requirement type="package" version="3.1.2">R</requirement>
    <requirement type="package" version="1.7.1">numpy</requirement>
    <requirement type="package" version="0.1">mutspec</requirement>

<command interpreter="bash">
	"--nbSign $nbsign"
	#if $refGenomeSource.source == "html":
	#end if

	<conditional name="refGenomeSource">
		<param name="source" type="select" label="Input a MutSpec-Stats report or a matrix" help="You may select either a report generated by MutSpec-Stats or a tab-delimited text matrix">
			<option value="html">Dataset generated by the tool MutSpec-Stats</option>
			<option value="tab">Tab-delimited matrix</option>
		<when value="html">
			<param name="reportHTML" type="data" format="html" label="Input dataset" help="Select a report generated by the MutSpec-Stats tool"/>
		<when value="tab">
			<param name="matrix" type="data" format="tabular" label="Input matrix" help="Select a matrix formatted as shown further below"/>
	<param name="nbsign" type="text" value="2" label="Number of expected signatures" help="Minimum value: 2" />

	<data name="html_file" format="html" label="NMF result on ${on_string} ($nbsign signatures)" />


**What it does**

Extracts mutation signatures composed of 96 SBS types (6 SBS types in their trinucleotide sequence context) using the non-negative matrix factorisation (`NMF`__) algorithm of Brunet with the Kullback-Leibler divergence penalty, as implemented in an `R package`__.

.. __:
.. __:
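
As an illustration of where the number 96 comes from, the sketch below enumerates the 6 substitution classes in each of the 4 x 4 possible flanking-base contexts. It is a minimal sketch and not part of the tool: the function name ``sbs96_labels`` is hypothetical, and the row order it produces is an assumption that may differ from the order MutSpec uses.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, written from the pyrimidine (C or T) reference base.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def sbs96_labels():
    """Return the 96 labels in the form 5'[ref>alt]3', e.g. 'A[C>A]A'.

    Hypothetical helper: the ordering is an assumption, not the tool's own.
    """
    return [
        f"{five}[{sub}]{three}"
        for five, sub, three in product(BASES, SUBSTITUTIONS, BASES)
    ]


labels = sbs96_labels()
print(len(labels))  # 96
print(labels[0])    # A[C>A]A
```

Each label has the same shape as the row names of the example matrix shown in this help (5' base, substitution in brackets, 3' base).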


**Input formats**

The tool accepts an HTML report produced by the tool MutSpec-Stats or a matrix of mutation counts in tab-delimited text format (see example below).

.. class:: warningmark

If the input is a matrix of mutation counts, the sum of the mutation counts in each row must not be zero.
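
The row-sum constraint can be checked before submitting a matrix. The following is a minimal sketch, assuming a tab-delimited file with one header line of sample names and the SBS type in the first column; the function name ``zero_sum_rows`` and the two-row input are illustrative only and are not part of MutSpec.

```python
import csv
import io


def zero_sum_rows(handle):
    """Return the labels of rows whose mutation counts sum to zero."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header line holding the sample names
    bad = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        label, counts = row[0], [float(x) for x in row[1:]]
        if sum(counts) == 0:
            bad.append(label)
    return bad


# Illustrative input: the second row is all zeros and would be rejected.
example = io.StringIO(
    "\tSample_1\tSample_2\tSample_3\n"
    "A[C>A]A\t4\t3\t1\n"
    "A[C>T]A\t0\t0\t0\n"
)
print(zero_sum_rows(example))  # ['A[C>T]A']
```

An empty list means every row satisfies the constraint.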



The output consists of matrices and graphs representing the composition of the mutation signatures found by NMF (Matrix W) and the contributions of each sample to the signatures (Matrix H). The tool also produces a matrix that can be used with the tool MutSpec-compare to compare the identified signatures with known signatures.
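
To make the roles of W and H concrete, the sketch below factorises a count matrix V (SBS types x samples) into W (SBS types x signatures) and H (signatures x samples) using the standard multiplicative updates for the Kullback-Leibler divergence. It is a minimal sketch in NumPy, not the R implementation the tool calls: the function names, the parameters ``n_iter``, ``seed`` and ``eps``, and the random initialisation are all assumptions. The input is the 12 rows shown in the example matrix of this help, so it is a truncated matrix used for illustration only.

```python
import numpy as np


def nmf_kl(V, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate V by W @ H with k signatures (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps  # signature composition
    H = rng.random((k, m)) + eps  # per-sample contributions
    for _ in range(n_iter):
        WH = W @ H + eps
        # Update H: rescale by W^T (V / WH), normalised by the column sums of W.
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        # Update W: rescale by (V / WH) H^T, normalised by the row sums of H.
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def kl_divergence(V, WH, eps=1e-9):
    """Generalised Kullback-Leibler divergence between V and its approximation."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


# The 12 rows visible in the example matrix (3 samples); not a full 96-row input.
V = np.array([
    [4, 3, 1], [2, 1, 0], [13, 2, 1], [10, 3, 6], [9, 6, 1], [2, 1, 0],
    [5, 2, 2], [5, 2, 0], [11, 4, 2], [3, 0, 5], [39, 17, 1], [12, 8, 1],
], dtype=float)

W, H = nmf_kl(V, k=2)
print(W.shape, H.shape)
print(kl_divergence(V, W @ H))
```

Because the result depends on the random starting point, NMF is usually run several times with different seeds; the single run here is only meant to show the shapes and meaning of the two matrices.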


**Example: matrix of mutation counts (96 rows + a header with the sample names)**

|        | Sample_1 | Sample_2 | Sample_3 |
|A[C>A]A |     4    |     3    |     1    |
|A[C>T]A |     2    |     1    |     0    |
|A[C>G]A |    13    |     2    |     1    |
|A[T>A]A |    10    |     3    |     6    |
|A[T>C]A |     9    |     6    |     1    |
|A[T>G]A |     2    |     1    |     0    |
|                  ...                    |
|T[C>A]T |     5    |     2    |     2    |
|T[C>G]T |     5    |     2    |     0    |
|T[C>T]T |    11    |     4    |     2    |
|T[T>A]T |     3    |     0    |     5    |
|T[T>C]T |    39    |    17    |     1    |
|T[T>G]T |    12    |     8    |     1    |


    <citation type="bibtex">
            author = {Ardin et al},
            keywords = {Galaxy, Mutation signatures, Mutation spectra, Single base substitutions},
            title = {{MutSpec}: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes},
            url = {}
