Between-table Correlation (version 1.0.0)

**Author:** Ophelie Barbet for original code (PFEM - INRA)
**Maintainer:** Melanie Petera (PFEM - INRA - MetaboHUB)

Visualises the links between two data tables by computing a correlation table between the variables of these tables, and by drawing a heatmap of that correlation table, colored according to the coefficients.

Parameter | Format |
---|---|
1 : Table 1 file | tabular |
2 : Table 2 file | tabular |

The two input tables must have the same sample IDs. This is essential to calculate the correlations correctly.

- 'Pearson': measures the intensity of the linear association between two continuous variables.

- 'Spearman' and 'Kendall': these methods are explained in the R documentation of the 'cor' function as follows: "Kendall's tau or Spearman's rho statistic is used to estimate a rank-based measure of association. These are more robust and have been recommended if the data do not necessarily come from a bivariate normal distribution."
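To illustrate the difference between a linear and a rank-based coefficient, here is a minimal sketch in plain Python. It is not the module's code (the module relies on R's 'cor' function); the function names `pearson`, `average_ranks` and `spearman` are hypothetical helpers written for this example, and Kendall's tau is left out for brevity.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotonic but non-linear relationship (y = x ** 3).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("Pearson:", pearson(x, y))
print("Spearman:", spearman(x, y))
```

On this toy data the ranks of `x` and `y` are identical, so the rank-based coefficient reaches 1 while the Pearson coefficient stays below 1 because the relationship is not linear.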

This test is performed on each correlation coefficient, with the following hypotheses:

- H0: the correlation coefficient is not significantly different from zero.

- H1: the correlation coefficient is significantly different from zero.

Coefficients for which the null hypothesis (H0) is not rejected are replaced by zeros in the correlation table.

The 7 methods implemented in the 'p.adjust' R function are available and documented as follows:

"The adjustment methods include the Bonferroni correction ("bonferroni") in which the p-values are multiplied by the number of comparisons. Less conservative corrections are also included by Holm (1979) ("holm"), Hochberg (1988) ("hochberg"), Hommel (1988) ("hommel"), Benjamini and Hochberg (1995) ("BH" or its alias "fdr"), and Benjamini and Yekutieli (2001) ("BY"), respectively. A pass-through option ("none") is also included. The set of methods are contained in the p.adjust.methods vector for the benefit of methods that need to have the method as an option and pass it on to p.adjust. The first four methods are designed to give strong control of the family-wise error rate. There seems no reason to use the unmodified Bonferroni correction because it is dominated by Holm's method, which is also valid under arbitrary assumptions. Hochberg's and Hommel's methods are valid when the hypothesis tests are independent or when they are non-negatively associated (Sarkar, 1998; Sarkar and Chang, 1997). Hommel's method is more powerful than Hochberg's, but the difference is usually small and the Hochberg p-values are faster to compute. The "BH" (aka "fdr") and "BY" method of Benjamini, Hochberg, and Yekutieli control the false discovery rate, the expected proportion of false discoveries amongst the rejected hypotheses. The false discovery rate is a less stringent condition than the family-wise error rate, so these methods are more powerful than the others."
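As an illustration of what such adjustments do, the sketch below re-implements three of the methods ("bonferroni", "holm" and "BH") in plain Python, following the usual textbook definitions. It is not the module's code, which calls R's 'p.adjust'; the function names `adjust_pvalues` and `zero_non_significant` are hypothetical, and the rule "H0 is rejected when the adjusted p-value is strictly below the risk" is an assumption made for this example.

```python
def adjust_pvalues(pvalues, method="BH"):
    """Return adjusted p-values, in the same order as the input."""
    n = len(pvalues)
    if method == "none":
        return list(pvalues)
    if method == "bonferroni":
        return [min(1.0, p * n) for p in pvalues]

    order = sorted(range(n), key=lambda i: pvalues[i])  # ascending p-values
    adjusted = [0.0] * n
    if method == "holm":
        # Step-down: multiply the k-th smallest p-value by (n - k + 1),
        # then enforce monotonicity with a running maximum.
        running_max = 0.0
        for rank, idx in enumerate(order):
            running_max = max(running_max, (n - rank) * pvalues[idx])
            adjusted[idx] = min(1.0, running_max)
        return adjusted
    if method == "BH":
        # Step-up: multiply the k-th smallest p-value by n / k,
        # then enforce monotonicity with a running minimum from the top.
        running_min = 1.0
        for rank in range(n - 1, -1, -1):
            idx = order[rank]
            running_min = min(running_min, pvalues[idx] * n / (rank + 1))
            adjusted[idx] = running_min
        return adjusted
    raise ValueError(f"method not covered by this sketch: {method}")

def zero_non_significant(coefficients, adjusted, alpha=0.05):
    """Replace by zero the coefficients whose H0 is not rejected.

    Assumption: H0 is rejected when the adjusted p-value is < alpha.
    """
    return [c if p < alpha else 0.0 for c, p in zip(coefficients, adjusted)]

# Toy values, invented for the example.
coefficients = [0.9, -0.2, 0.5, 0.7]
pvalues = [0.01, 0.04, 0.03, 0.005]
for method in ("bonferroni", "holm", "BH"):
    adjusted = adjust_pvalues(pvalues, method)
    print(method, adjusted, zero_non_significant(coefficients, adjusted))
```

With these toy values all four raw p-values are below 0.05, yet the family-wise corrections set some coefficients to zero while "BH" keeps all of them, which reflects the difference in stringency described in the quotation.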

A value between 0 and 1, usually 0.05.

Reduces the size of the correlation table by keeping only the variables considered relevant.

- 'Only zero filter': Remove variables with all their correlation coefficients equal to zero.

- 'Threshold filter': Remove variables with all their correlation coefficients (in absolute value) strictly below a threshold.
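The two filters can be sketched as follows in plain Python. This is an illustration under stated assumptions, not the module's code: the table is assumed to hold table 1 variables in rows and table 2 variables in columns, rows and columns are both evaluated on the unfiltered table, and the function name `filter_table` is hypothetical.

```python
def filter_table(table, row_names, col_names, threshold=None):
    """Drop the rows and columns that carry no relevant coefficient.

    threshold=None : 'Only zero filter' (drop variables whose coefficients
                     are all equal to zero).
    threshold=t    : 'Threshold filter' (drop variables whose coefficients
                     are all strictly below t in absolute value).
    """
    def keep(values):
        if threshold is None:
            return any(v != 0 for v in values)
        return any(abs(v) >= threshold for v in values)

    kept_rows = [i for i, row in enumerate(table) if keep(row)]
    kept_cols = [j for j in range(len(col_names))
                 if keep([row[j] for row in table])]
    filtered = [[table[i][j] for j in kept_cols] for i in kept_rows]
    return (filtered,
            [row_names[i] for i in kept_rows],
            [col_names[j] for j in kept_cols])

# Toy correlation table, invented for the example.
table = [
    [0.0,  0.9, 0.0],
    [0.0,  0.0, 0.0],
    [0.0, -0.3, 0.6],
]
rows, cols = ["a", "b", "c"], ["x", "y", "z"]
print(filter_table(table, rows, cols))                 # only zero filter
print(filter_table(table, rows, cols, threshold=0.7))  # threshold filter
```

In this toy case the zero filter removes variable "b" and variable "x", whereas a threshold of 0.7 keeps only the pair ("a", "y").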

Sets some parameters for the correlation table output and the PDF file.

Places the most strongly linked variables close to each other in the correlation table.

A hierarchical cluster analysis (HCA) is performed on each input table, with:

- 1 - correlation coefficient, as distance

- Ward as aggregation method.

- 'Default': generates a pdf file with a colored correlation table if the filtered table has no dimension above 1000 (number of lines or columns).

- 'Always plot a colored table': generates the colored correlation table whatever its size. The resulting table can be huge, so use this option wisely.

- 'No colored table': the module will generate the correlation table in tabular format only (no pdf file).

Only available when PDF output is set to 'Default' or 'Always plot a colored table'.

Creates a colored correlation table in which the variables of table 1 and the variables of table 2 are related using colored rectangles. Negative correlations are in red, more or less intense according to their position between -1 and 0, and positive correlations are in green, more or less intense according to their position between 0 and 1. Coefficients equal to 0 are in white.

- 'Standard': the graphical representation has a scale with a smooth gradient between three colors: red, white and green.

- 'Customized': the colored correlation table has coefficient classes. It is possible to create regular or irregular classes. The scale is discrete.

- 'Regular': classes are all (or almost) the same size.

To build these intervals, we start from 1 and move toward 0 in steps of the size chosen by the user, then mirror the result from -1 toward 0. If the last step does not fall on the value 0, we create a class between this last value and 0, smaller in size than the others. Note that 0 is a class on its own, which is assigned the color white in the heatmap.

Example: if the size is 0.4, classes are [-1;-0.6], ]-0.6;-0.2], ]-0.2;0[, 0, ]0;0.2], ]0.2;0.6] and ]0.6;1].

- 'Irregular': classes have variable lengths.

It is possible to define as many classes as you want, of any size. The classes between -1 and 0 do not have to mirror those between 0 and 1. You can choose to have a white class containing only 0, or an interval that contains the value 0.

Example: if the vector is (-0.8,-0.5,-0.4,0,0.4,0.5,0.8), the classes are [-1;-0.8], ]-0.8;-0.5], ]-0.5;-0.4], ]-0.4;0[, 0, ]0;0.4], ]0.4;0.5], ]0.5;0.8] and ]0.8;1].
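The two examples above can be reproduced with the following sketch in plain Python. It is only an illustration of the class definitions as described here, not the module's code; the function names `regular_breaks` and `class_label` are hypothetical, and the rounding to 10 decimals is an assumption made to avoid floating-point noise such as 0.19999999999999996.

```python
from bisect import bisect_left

def regular_breaks(size):
    """Breakpoints for 'Regular' classes: step from 1 toward 0, then mirror."""
    if not 0 < size <= 1:
        raise ValueError("size must be in ]0;1]")
    positives = []
    k = 1
    while round(1 - k * size, 10) > 0:
        positives.append(round(1 - k * size, 10))
        k += 1
    positives.sort()
    negatives = [-b for b in reversed(positives)]
    # 0 is always a breakpoint, so that it forms a class on its own.
    return negatives + [0.0] + positives

def class_label(coef, breaks):
    """Label of the class containing coef, for sorted breakpoints in ]-1;1[."""
    zero_alone = 0 in breaks
    if zero_alone and coef == 0:
        return "0"
    edges = [-1.0] + list(breaks) + [1.0]
    # First edge that is >= coef; index 0 (coef == -1) belongs to class 1.
    i = max(bisect_left(edges, coef), 1)
    lower, upper = edges[i - 1], edges[i]
    left = "[" if i == 1 else "]"
    # When 0 is a class on its own, the class just below it excludes 0.
    right = "[" if (zero_alone and upper == 0) else "]"
    return f"{left}{lower:g};{upper:g}{right}"

breaks = regular_breaks(0.4)
print(breaks)
for coef in (-1, -0.1, 0, 0.1, 0.9):
    print(coef, "->", class_label(coef, breaks))

irregular = [-0.8, -0.5, -0.4, 0, 0.4, 0.5, 0.8]
print(class_label(-0.45, irregular))
```

If the vector of breakpoints does not contain 0, this sketch simply treats the interval around 0 as an ordinary class, which corresponds to the second choice mentioned for irregular classes.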

Tabular output: correlation table between the variables of the two input tables.

Pdf output: colored representation of the correlation table. The coefficients are replaced by colors. A coefficient close to -1 is red, close to 0 white, and close to 1 green.