Boxplot: Transcriptome data | Boxplot: Proteome data |
---|---|
![]() | ![]() |
Transcript Fold-Change | Protein Fold-Change |
---|---|
![]() | ![]() |
PCA plot: Transcriptome data | PCA plot: Proteome data |
---|---|
![]() | ![]() |
Scatter plot between Proteome and Transcriptome Abundance |
---|
![]() |
Assuming a linear relationship between the proteome and transcriptome data, we fit a linear regression model of protein abundance on transcript abundance.
Observations that strongly influence this fit are then identified further below using a Cook's distance based approach.
Parameter | Value |
---|---|
Formula | PE_abundance ~ TE_abundance |
Coefficients | |
(Intercept) | -0.06910598 (p-value: 1.220723e-05) |
TE_abundance | 0.1712395 (p-value: 4.168015e-10) |
Model parameters | |
Residual standard error | 0.8363295 (2815 degrees of freedom) |
F-statistic | 39.31142 (on 1 and 2815 degrees of freedom) |
R-squared | 0.01377265 |
Adjusted R-squared | 0.0134223 |
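
The quantities in the table above can be reproduced with ordinary least squares. The following is a minimal sketch in plain Python, not the code that generated this report: the function names `fit_simple_ols` and `f_statistic` and the toy fold-change values are illustrative assumptions. The last lines check that the reported R-squared and residual degrees of freedom are consistent with the reported F-statistic.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)        # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)   # total sum of squares
    r_squared = 1.0 - sse / sst
    return intercept, slope, r_squared, residuals


def f_statistic(r_squared, residual_df):
    """F-statistic of a one-predictor model from R-squared and residual df."""
    return r_squared / (1.0 - r_squared) * residual_df


# Hypothetical toy values standing in for TE_abundance and PE_abundance.
te = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
pe = [-0.5, -0.1, 0.0, 0.3, 0.2, 0.6]
intercept, slope, r2, residuals = fit_simple_ols(te, pe)
print(f"intercept={intercept:.4f} slope={slope:.4f} R^2={r2:.4f}")

# Consistency check using the values reported in the table above.
f_reported = f_statistic(0.01377265, 2815)
print(f"F from reported R^2: {f_reported:.3f}")
```

An R-squared near 0.014 means the fitted line explains only about 1.4% of the variance in protein abundance, even though the slope is statistically significant at this sample size.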
1) Residuals vs Fitted plot | 2) Normal Q-Q plot of residuals |
---|---|
![]() | ![]() |
This plot checks the linearity assumption. A roughly horizontal line without distinct patterns indicates a linear relationship. | This plot checks whether the residuals are normally distributed. Ideally, the residual points follow the straight dashed line without deviating much from it. |
Residuals from Regression | |
---|---|
Parameter | Value |
Mean Residual value | 1.942328e-17 |
Standard deviation (Residuals) | 0.836181 |
Total outliers (Residual value > 2 standard deviations from the mean) | 164 (Download these 164 data points with high residual values here) |
(Download the complete residuals data here) |
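
The outlier rule in the table above can be sketched as follows. This is an illustrative example, not the report's own code: the helper name `flag_residual_outliers` and the toy residuals are assumptions, and the rule is assumed to apply to the absolute deviation from the mean, which the report does not state explicitly.

```python
import statistics


def flag_residual_outliers(residuals, n_sd=2.0):
    """Return indices of residuals more than n_sd standard deviations from the mean.

    Assumption: the deviation is taken in absolute value, so both large
    positive and large negative residuals are flagged.
    """
    mean = statistics.mean(residuals)
    sd = statistics.stdev(residuals)  # sample standard deviation (n - 1)
    return [i for i, e in enumerate(residuals) if abs(e - mean) > n_sd * sd]


# Hypothetical residuals; only the value at index 5 is far from the rest.
toy_residuals = [0.1, -0.2, 0.05, -0.1, 0.15, 3.0, -0.05, 0.0, 0.1, -0.1]
outliers = flag_residual_outliers(toy_residuals)
print(outliers)
```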
3) Residuals vs Leverage plot |
---|
![]() |
This plot helps identify influential cases, that is, outliers or extreme values whose inclusion in or exclusion from the analysis might change the regression results. |
Cook's distance measures the influence of each observation on the fitted values, i.e., how much the fitted values change when that observation is left out of the regression.
As a common rule of thumb, observations with a Cook's distance greater than 4 times the mean Cook's distance may be classified as influential.
Parameter | Value |
---|---|
Mean Cook's distance | 0.0004875011 |
Total influential observations (Cook's distance > 4 * mean Cook's distance) | 115 |
Observations with Cook's distance < 4 * mean Cook's distance | 2702 |
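
For a regression with a single predictor, Cook's distance has a closed form based on each residual and its leverage, so the 4 * mean cutoff used above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation behind this report: the function names `cooks_distances` and `influential_indices` and the toy data are hypothetical.

```python
def cooks_distances(x, y):
    """Cook's distance of every observation in the simple regression y ~ x."""
    n = len(x)
    p = 2  # number of fitted coefficients: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage of this observation in a one-predictor model.
        leverage = 1.0 / n + (xi - mean_x) ** 2 / sxx
        # D_i = e_i^2 / (p * MSE) * h_i / (1 - h_i)^2
        distances.append(e ** 2 / (p * mse) * leverage / (1.0 - leverage) ** 2)
    return distances


def influential_indices(distances, multiplier=4.0):
    """Indices whose Cook's distance exceeds multiplier * mean Cook's distance."""
    cutoff = multiplier * sum(distances) / len(distances)
    return [i for i, d in enumerate(distances) if d > cutoff]


# Hypothetical data: nine points near a line plus one distant, off-trend point.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9, 9.1, 2.0]
distances = cooks_distances(x, y)
flagged = influential_indices(distances)
print(flagged)
```

In the toy data, the last point combines high leverage with a large residual, which is the pattern that produces a large Cook's distance.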
Scatterplot: Before removal | Scatterplot: After removal |
---|---|
![]() | ![]() |
Gene | Protein Log Fold-Change | Transcript Log Fold-Change | Cook's Distance |
---|---|---|---|
CATHL2 | -1.960863 | 4.88565 | 0.1432189 |
CD177 | -4.173263 | 2.057499 | 0.06826605 |
CATHL1 | -0.9912973 | 4.835209 | 0.05767091 |
HP | 2.570727 | 3.885549 | 0.04680496 |
AZU1 | -2.226356 | -5.561874 | 0.03737565 |
ELANE | -2.732479 | -2.914936 | 0.03266198 |
PYGM | -0.06079228 | 6.071712 | 0.03242859 |
LTF | -2.4294 | 2.129742 | 0.02725017 |
ATP1A2 | 0.2871971 | 6.446299 | 0.01939256 |
C13H20orf194 | -5.640732 | -0.6697401 | 0.01852927 |
Heatmap of PE and TE abundance values (Hierarchical clustering) | Number of clusters to extract: 5 |
---|---|
![]() | |
Download the hierarchical cluster list |
K-means clustering | Number of clusters: 4 |
---|---|
![]() | |
Download the cluster list |
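
As a rough illustration of how a k-means cluster list such as the one above could be produced, the sketch below runs Lloyd's algorithm on (protein, transcript) value pairs. It is an assumption-laden example, not the report's method: the function name `kmeans`, the random initialization, the seed, and the toy points are all hypothetical, and the report does not state which features or implementation it uses.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(pt, centroids[c])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical (PE, TE) log fold-change pairs forming two separated groups.
toy_points = [(0.1, 0.2), (0.0, -0.1), (-0.2, 0.1),
              (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(toy_points, k=2)
print(labels)
```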