| Boxplot: Transcriptome data | Boxplot: Proteome data |
|---|---|
| ![]() | ![]() |
| Transcript Fold-Change | Protein Fold-Change |
|---|---|
| ![]() | ![]() |
| PCA plot: Transcriptome data | PCA plot: Proteome data |
|---|---|
| ![]() | ![]() |
| Scatter plot between Proteome and Transcriptome Abundance |
|---|
| ![]() |

Assuming a linear relationship between proteome and transcriptome abundance, we fit a linear regression model. Individual observations can have a strong influence on such a fit, so a Cook's distance based approach is used further below to identify influential observations.

| Parameter | Value |
|---|---|
| Formula | PE_abundance~TE_abundance |
| Coefficients | |
| (Intercept) | -0.06910598 (p-value: 1.220723e-05) |
| TE_abundance | 0.1712395 (p-value: 4.168015e-10) |
| Model parameters | |
| Residual standard error | 0.8363295 (on 2815 degrees of freedom) |
| F-statistic | 39.31142 (on 1 and 2815 degrees of freedom) |
| R-squared | 0.01377265 |
| Adjusted R-squared | 0.0134223 |
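
The quantities in the table above can be reproduced for any pair of abundance vectors with ordinary least squares. The following is a minimal sketch in plain Python, not the tool's own implementation; the function name `fit_simple_ols` and the toy `te`/`pe` values are assumptions made purely for illustration.

```python
import math


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)          # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)     # total sum of squares
    df_resid = n - 2                              # two fitted parameters
    r_squared = 1 - sse / sst

    return {
        "intercept": intercept,
        "slope": slope,
        "residuals": residuals,
        "df_resid": df_resid,
        "residual_se": math.sqrt(sse / df_resid),
        "f_statistic": (sst - sse) / (sse / df_resid),
        "r_squared": r_squared,
        "adj_r_squared": 1 - (1 - r_squared) * (n - 1) / df_resid,
    }


# Toy values (made up for illustration): transcript and protein log fold-changes.
te = [-2.0, -1.0, 0.0, 1.0, 2.0]
pe = [-0.5, -0.1, 0.0, 0.3, 0.4]
fit = fit_simple_ols(te, pe)
print(fit["intercept"], fit["slope"], fit["r_squared"])
```

For a single predictor, the F-statistic equals R²/(1 − R²) multiplied by the residual degrees of freedom. With the reported values, 0.01377265 / (1 − 0.01377265) × 2815 ≈ 39.31, which agrees with the F-statistic in the table.
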
| 1) Residuals vs Fitted plot | 2) Normal Q-Q plot of residuals |
|---|---|
| ![]() | ![]() |
| This plot checks the linearity assumption. A roughly horizontal line with no distinct pattern indicates a linear relationship. | This plot checks whether the residuals are normally distributed. The residual points should follow the straight dashed line closely, without deviating much from it. |
| Residuals from Regression | |
|---|---|
| Parameter | Value |
| Mean Residual value | 1.942328e-17 |
| Standard deviation (Residuals) | 0.836181 |
| Total outliers (residual more than 2 standard deviations from the mean) | 164 |
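
The outlier rule in the table above can be sketched as follows. This is an illustrative snippet, not the tool's code: the helper name `flag_residual_outliers` and the toy residuals are hypothetical, and the rule is assumed to be two-sided (large positive and large negative residuals are both flagged), which the report does not state explicitly.

```python
import statistics


def flag_residual_outliers(residuals, n_sd=2.0):
    """Return indices of residuals more than n_sd standard deviations from the mean.

    Assumption: two-sided rule, using the sample standard deviation (n - 1).
    """
    mean = statistics.mean(residuals)
    sd = statistics.stdev(residuals)  # sample standard deviation
    return [i for i, e in enumerate(residuals) if abs(e - mean) > n_sd * sd]


# Toy residuals (made up for illustration); only the fourth value is extreme.
toy_residuals = [0.1, -0.2, 0.05, 3.0, -0.1, 0.15, -0.05, 0.0]
outlier_idx = flag_residual_outliers(toy_residuals)
print(outlier_idx)
```

The reported residual standard deviation (0.836181) is consistent with a sample standard deviation: it equals the residual standard error 0.8363295 scaled by sqrt(2815/2816).
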
| 3) Residuals vs Leverage plot |
|---|
| ![]() |
| This plot helps identify influential cases, that is, outliers or extreme values whose inclusion in or exclusion from the analysis could change the regression results. |

Cook's distance quantifies the influence of each observation on the fitted values, that is, how much the fitted values would change if that observation were left out.
As a rule of thumb, observations with a Cook's distance greater than 4 times the mean Cook's distance may be classified as influential.

| Parameter | Value |
|---|---|
| Mean Cook's distance | 0.0004875011 |
| Total influential observations (Cook's distance > 4 * mean Cook's distance) | 115 |
| Observations with Cook's distance < 4 * mean Cook's distance | 2702 |
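
For a single-predictor model, Cook's distance has a closed form in terms of each observation's residual and leverage, so the 4 × mean rule can be sketched without any external library. The code below is an illustration under that assumption; `cooks_distances`, `influential_indices`, and the toy data are hypothetical names and values, not part of the report.

```python
def cooks_distances(x, y):
    """Cook's distance for each observation of the simple regression y ~ x."""
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)

    distances = []
    for xi, ei in zip(x, residuals):
        leverage = 1 / n + (xi - mean_x) ** 2 / sxx
        distances.append(ei ** 2 / (p * mse) * leverage / (1 - leverage) ** 2)
    return distances


def influential_indices(distances, multiplier=4.0):
    """Indices whose Cook's distance exceeds multiplier * mean Cook's distance."""
    cutoff = multiplier * sum(distances) / len(distances)
    return [i for i, d in enumerate(distances) if d > cutoff]


# Toy data (made up): seven points on a line plus one extreme, off-trend point.
te = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 6.0]
pe = [-0.6, -0.4, -0.2, 0.0, 0.2, 0.4, 0.6, -3.0]
d = cooks_distances(te, pe)
influential = influential_indices(d)
print(influential)
```

Observations returned by `influential_indices` would be the ones removed before redrawing the scatter plot, which corresponds to the 115 observations counted in the table above.
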
| Scatterplot: Before removal | Scatterplot: After removal |
|---|---|
| ![]() | ![]() |

| Gene | Protein Log Fold-Change | Transcript Log Fold-Change | Cook's Distance |
|---|---|---|---|
| CATHL2 | -1.960863 | 4.88565 | 0.1432189 |
| CD177 | -4.173263 | 2.057499 | 0.06826605 |
| CATHL1 | -0.9912973 | 4.835209 | 0.05767091 |
| HP | 2.570727 | 3.885549 | 0.04680496 |
| AZU1 | -2.226356 | -5.561874 | 0.03737565 |
| ELANE | -2.732479 | -2.914936 | 0.03266198 |
| PYGM | -0.06079228 | 6.071712 | 0.03242859 |
| LTF | -2.4294 | 2.129742 | 0.02725017 |
| ATP1A2 | 0.2871971 | 6.446299 | 0.01939256 |
| C13H20orf194 | -5.640732 | -0.6697401 | 0.01852927 |
| Heatmap of PE and TE abundance values (Hierarchical clustering) | Number of clusters to extract: 5 |
|---|---|
![]() | |
| K-means clustering | Number of clusters: 4 |
|---|---|
![]() | |