+ help="number of new variables (components) computed by the data integration" />
+ help="maximum number of iterations performed by block.splsda" />
@@ -130,7 +128,8 @@
-
+
+
@@ -147,8 +146,12 @@
 Description
 -----------
-The blocks.splsda function is part of the mixOmics package for exploration and integration of Omics datasets.
-Performs N-integration and feature selection with Projection to Latent Structures models (PLS) with sparse Discriminant Analysis.
+The block.splsda function is part of the mixOmics package for exploration and integration of omics datasets.
+This data integration takes as input different omics datasets
+(transcriptomic, metabolomic, metagenomic, ...) and a response variable (e.g. for a sample, the value of the response
+variable is equal to « Treated » or « Control »). This data integration returns, for each omics dataset, variables
+which are correlated with the variables of the other omics datasets and the response variable. The other functions of
+this pipeline allow visualizing these correlated variables using correlation circles and networks.
 -----------------
 Workflow position
@@ -183,18 +186,21 @@
 | 2 : [opt] Variables metadata | tabular |
 +------------------------------+------------+
+1. Data matrix structure
+The data matrix is in tabular format (.tsv).
+The first column contains the variable names.
+The first row contains the sample names.
+Sample names must be in the same order for all blocks and the sample metadata (transposed). The data must not contain missing values.
+
+2. Variables metadata structure
+The variables metadata is in tabular format (.tsv).
+The first column contains the variable names.
+The first row contains the metadata column names.
+The number of rows in the metadata file must be the same as the number of rows in the block data file, and the variables need to be in the same order. If a metadata file is provided, block.splsda output will be appended as new columns, otherwise a new file will be created.
+
 Variables metadata files are optional. If a file is provided, output metadata will be appended to the input file, otherwise a new output file will be created.
-1. Data matrix format
- * Rows = variables, Columns = samples
- * First row = samples name. MUST be the same and in the same order in every block as well as in the sample metadata file (transposed)
- * First column = variables name
-
-2. Variables metadata format
- * Rows = variables, Columns = metadata
- * First row = metadata column names
- * First column = variables names. MUST be the same and in the same order than in the associated data matrix
 Global input files:
 -------------------
@@ -205,13 +211,11 @@
 | 1 : Samples metadata | tabular |
 +-----------------------------+------------+
-By default, the last column of the samples metadata matrix will be used as samples description factors.
-If it's not the case, the column number can be inputed in the `Sample description column number` parameter.
-
-1. Samples metadata format
- * Rows = samples, Columns = metadata
- * First row = metadata column names
- * First column = sample names. These names must be identical (transposed) and in the same order than for the blocks data matrices
+1. Samples metadata structure
+Samples metadata is in tabular format (.tsv).
+The first column contains the sample names.
+The first row contains the metadata column names.
+Sample names must be in the same order in the samples metadata (transposed) and in all the blocks. One of the columns (the last by default) must contain the sample groups for integration.
 ----------
 Parameters
@@ -270,4 +274,4 @@
-
\ No newline at end of file
+
diff -r df8428358b7f -r d4e9f7546dfa mixomics_plotindiv.xml
--- a/mixomics_plotindiv.xml Fri Oct 23 11:26:18 2020 +0000
+++ b/mixomics_plotindiv.xml Tue Nov 17 13:01:44 2020 +0000
@@ -1,4 +1,4 @@
-
+
 provides scatter plots for individuals (experimental units) representation in (sparse)(I)PCA,(regularized)CCA, (sparse)PLS(DA) and (sparse)(R)GCCA(DA)
@@ -21,14 +21,12 @@
 --output_pdf $output_pdf
-
 @COMMAND_LOG_EXIT@
-
 ]]>
+ help="this is the RData output file from the block.splsda function" />

@@ -60,8 +58,7 @@
 Description
 -----------
-The plotIndiv function is part of the mixOmics package for exploration and integration of Omics datasets.
-Provides scatter plots for individuals (experimental units) representation in (sparse)(I)PCA,(regularized)CCA, (sparse)PLS(DA) and (sparse)(R)GCCA(DA).
+This tool allows visualizing the samples on a two-dimensional plot. An effect can be visualized along the abscissa axis and along the ordinate axis.
 -----------------
 Workflow position
diff -r df8428358b7f -r d4e9f7546dfa mixomics_plotvar.xml
--- a/mixomics_plotvar.xml Fri Oct 23 11:26:18 2020 +0000
+++ b/mixomics_plotvar.xml Tue Nov 17 13:01:44 2020 +0000
@@ -1,4 +1,4 @@
-
+
 provides variables representation for (regularized) CCA, (sparse) PLS regression, PCA and (sparse) Regularized generalised CCA
@@ -21,20 +21,18 @@
 --output_pdf $output_pdf
-
 @COMMAND_LOG_EXIT@
-
 ]]>
+ help="this is the RData output file from the block.splsda function" />

+ help="only selected variables whose correlation with the first or second axis is greater than Cut-off in absolute value will be plotted" />

@@ -61,8 +59,8 @@
 Description
 -----------
-The plotVar function is part of the mixOmics package for exploration and integration of Omics datasets.
-Provides variables representation for (regularized) CCA, (sparse) PLS regression, PCA and (sparse) Regularized generalised CCA.
+This tool allows visualizing the variables of an omics dataset which are correlated with the variables
+of the other omics datasets and the response variable in a correlation circle.
 -----------------
 Workflow position
diff -r df8428358b7f -r d4e9f7546dfa test-data/mixomics_blocksplsda_output.rdata
Binary file test-data/mixomics_blocksplsda_output.rdata has changed
diff -r df8428358b7f -r d4e9f7546dfa viscorvar_circlecor.xml
--- a/viscorvar_circlecor.xml Fri Oct 23 11:26:18 2020 +0000
+++ b/viscorvar_circlecor.xml Tue Nov 17 13:01:44 2020 +0000
@@ -1,4 +1,4 @@
-
+
 plots a correlation circle for the datasets whose correlation circles can be superimposed. This correlation circle contains the selected variables of these datasets which are included in a rectangle and the response variables.
@@ -29,17 +29,17 @@
 @COMMAND_LOG_EXIT@
 ]]>
-
+ help="this is the RData output file from the matCorAddVar function"/>
+ help="output *_blocks_comb.tsv file from matCorAddVar."/>
+ help="each element of List of blocks vector contains blocks for which selected variables can
+ be visualized in the correlation circle">
@@ -47,11 +47,11 @@
+ help="choose the response variables which will be plotted in the correlation circle">
@@ -89,7 +89,11 @@
 Description
 -----------
-Bla bla...
+This tool allows visualizing variables of omics datasets which are correlated with
+the response variables using correlation circles. The omics datasets which can be
+visualized are determined by the tool matCorAddVar. This tool zooms in on
+a rectangle to retrieve omics datasets variables which are correlated with a
+response variable.
 -----------------
 Workflow position
diff -r df8428358b7f -r d4e9f7546dfa viscorvar_computematsimilarity.xml
--- a/viscorvar_computematsimilarity.xml Fri Oct 23 11:26:18 2020 +0000
+++ b/viscorvar_computematsimilarity.xml Tue Nov 17 13:01:44 2020 +0000
@@ -1,4 +1,4 @@
-
+
 performs the computation of the similarities. The similarity between two variables is an approximation of the correlation between these two variables.
@@ -21,11 +21,10 @@
 @COMMAND_LOG_EXIT@
 ]]>
-
+ help="this is the RData output file from the matCorAddVar function"/>
@@ -47,7 +46,9 @@
 Description
 -----------
-Bla bla...
+This tool is a pre-processing step for creating networks. It computes an
+approximation of the correlation between a variable of an omics dataset and a variable of
+another omics dataset.
 -----------------
 Workflow position
diff -r df8428358b7f -r d4e9f7546dfa viscorvar_matcoraddvar.xml
--- a/viscorvar_matcoraddvar.xml Fri Oct 23 11:26:18 2020 +0000
+++ b/viscorvar_matcoraddvar.xml Tue Nov 17 13:01:44 2020 +0000
@@ -1,4 +1,4 @@
-
+
 determine the correlation circles that can be overlaid and compute the correlations
@@ -34,14 +34,14 @@
+ help="this is the RData output file from the block.splsda function" />
+ help="Block Y is a table. Each column indicates which samples are associated with a phenotype (value equal to 1) or not (value equal to 0). For the file structure, see below in the section Input files" />
+ help="variables not belonging to any block will not be considered. For the file structure, see below in the section Input files"/>
@@ -68,7 +68,12 @@
 Description
 -----------
-Bla bla...
+This tool is a pre-processing step of the pipeline. It computes the correlations
+between omics datasets variables, variables of interest (optional), response variables and
+the components output by the data integration. The variables of interest are omics
+datasets variables that will be added to the network. It also determines the omics datasets
+for which the correlated variables of these omics datasets can be visualized with correlation
+circles and networks.
 -----------------
 Workflow position
@@ -110,6 +115,15 @@
 | 3 : [opt] Variables of interest | txt |
 +-----------------------------------------+-----------+
+2. Block Y structure
+Block Y is in tabular format (.tsv).
+The first column of this table contains the sample names.
+The other columns correspond to phenotypes.
+Each of these other columns indicates which samples are associated with a phenotype (value equal to 1) or not (value equal to 0). The sample names in Block Y (transposed), in the sample metadata (transposed) and in all datasets have to be in the same order.
+
+3. Variables of interest structure
+All the variables of interest are in the same column.
+
 ----------
 Parameters
 ----------
diff -r df8428358b7f -r d4e9f7546dfa viscorvar_networkvar.xml
--- a/viscorvar_networkvar.xml Fri Oct 23 11:26:18 2020 +0000
+++ b/viscorvar_networkvar.xml Tue Nov 17 13:01:44 2020 +0000
@@ -1,4 +1,4 @@
-
+
 creates a network between selected variables of datasets and the response variables. In the network, the similarity between two variables is associated with the link between these two variables.
@@ -23,10 +23,10 @@
 @COMMAND_LOG_EXIT@
 ]]>
-
+