Mercurial > repos > rsajulga > quantp
comparison test-data/output.html @ 0:0072d7fe861a draft
planemo upload
| author | rsajulga |
|---|---|
| date | Thu, 20 Dec 2018 15:12:46 -0500 |
| parents | |
| children |
comparison
| -1:000000000000 | 0:0072d7fe861a |
|---|---|
| 1 <html><head></head><body> | |
| 2 <h1><u>QuanTP: Association between abundance ratios of transcript and protein</u></h1><hr/> | |
| 3 <font><h3>Input data summary</h3></font> | |
| 4 <ul> | |
| 5 <li>Abbreviations used: PE (Proteome data) and TE (Transcriptome data) </li><br> | |
| 6 <li>Input Proteome data dimension (Row Column): 2817 x 5 </li> | |
| 7 <li>Input Transcriptome data dimension (Row Column): 2817 x 5 </li></ul><hr/> | |
| 8 <h3 id=table_of_content>Table of Contents:</h3> | |
| 9 <ul> | |
| 10 <li><a href=#sample_dist>Sample distribution</a></li> | |
| 11 <li><a href=#corr_data>Correlation</a></li> | |
| 12 <li><a href=#regression_data>Regression analysis</a></li> | |
| 13 <li><a href=#inf_obs>Influential observations</a></li> | |
| 14 <li><a href=#cluster_data>Cluster analysis</a></li></ul><hr/> | |
| 15 <h2 id="sample_dist"><font color=#ff0000>SAMPLE DISTRIBUTION</font></h2> | |
| 16 <table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> | |
| 17 <tr bgcolor="#7a0019"><th><font color=#ffcc33>Boxplot: Transcriptome data</font></th><th><font color=#ffcc33>Boxplot: Proteome data</font></th></tr> | |
| 18 <tr><td align=center> <img src="Box_TE_all_rep.png" width=500 height=500></td> | |
| 19 <td align=center> <img src="Box_PE_all_rep.png" width=500 height=500></td></tr></table> | |
| 20 <br><font color="#ff0000"><h3>Sample-wise distribution (box plot) after taking the mean of replicates</h3></font><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Boxplot: Transcriptome data</font></th><th><font color=#ffcc33>Boxplot: Proteome data</font></th></tr> | |
| 21 <tr><td align=center> <img src="Box_TE_rep.png" width=500 height=500></td> | |
| 22 <td align=center> <img src="Box_PE_rep.png" width=500 height=500></td></tr></table> | |
| 23 <br><font color="#ff0000"><h3>Distribution (Box plot) of log fold change </h3></font><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Boxplot: Transcriptome data</font></th><th><font color=#ffcc33>Boxplot: Proteome data</font></th></tr> | |
| 24 <tr><td align=center> <img src="Box_TE.png" width=500 height=500></td> | |
| 25 <td align=center> <img src="Box_PE.png" width=500 height=500></td></tr></table> | |
| 26 <br><br><font size=5><b><a href='PE_TE_logfold_pval.txt' target='_blank'>Download the complete fold change data here</a></b></font><br> | |
| 27 <br><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Transcript Fold-Change</font></th><th><font color=#ffcc33>Protein Fold-Change</font></th></tr> | |
| 28 <tr><td align=center> <img src="TE_volcano.png" width=600 height=600></td> | |
| 29 <td align=center> <img src="PE_volcano.png" width=600 height=600></td></tr></table><br> | |
| 30 <br><br><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>PCA plot: Transcriptome data</font></th><th><font color=#ffcc33>PCA plot: Proteome data</font></th></tr> | |
| 31 <tr><td align=center> <img src="PCA_TE_all_rep.png" width=500 height=500></td> | |
| 32 <td align=center> <img src="PCA_PE_all_rep.png" width=500 height=500></td></tr></table> | |
| 33 <hr/><h2 id="corr_data"><font color=#ff0000>CORRELATION</font></h2> | |
| 34 <br><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Scatter plot between Proteome and Transcriptome Abundance</font></th></tr> | |
| 35 <tr><td align=center> <img src="TE_PE_scatter.png" width=800 height=800></td></tr> | |
| 36 <tr><td align=center> | |
| 37 <table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Parameter</font></th><th><font color=#ffcc33>Method 1</font></th><th><font color=#ffcc33>Method 2</font></th><th><font color=#ffcc33>Method 3</font></th></tr> | |
| 38 <tr><td>Correlation method</td><td> Pearson's product-moment correlation </td><td> Spearman's rank correlation rho </td><td> Kendall's rank correlation tau </td></tr> | |
| 39 <tr><td>Correlation coefficient</td><td> 0.1173569 </td><td> 0.1608612 </td><td> 0.1093701 </td></tr> | |
| 40 </table> | |
| 41 <font color="red">*Note that <u>correlation</u> is <u>sensitive to outliers</u> in the data, so it is important to analyze outliers and influential observations.<br> Below we use a <u>Cook's distance-based approach</u> to identify such influential observations.</font> | |
| 42 </td></tr></table><hr/><h2 id="regression_data"><font color=#ff0000>REGRESSION ANALYSIS</font></h2> | |
| 43 <font><h3>Linear Regression model fit between Proteome and Transcriptome data</h3></font> | |
| 44 <p>Assuming a linear relationship between Proteome and Transcriptome data, we fit a linear regression model of Proteome abundance on Transcriptome abundance.</p> | |
| 45 <table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Parameter</font></th><th><font color=#ffcc33>Value</font></th></tr> | |
| 46 <tr><td>Formula</td><td> PE_abundance~TE_abundance </td></tr> | |
| 47 <tr><td colspan='2' align='center'> <b>Coefficients</b></td> </tr> | |
| 48 <tr><td> (Intercept) </td><td> -0.06910598 (Pvalue: 1.220723e-05 ) </td></tr> | |
| 49 <tr><td> TE_abundance </td><td> 0.1712395 (Pvalue: 4.168015e-10 ) </td></tr> | |
| 50 <tr><td colspan='2' align='center'> <b>Model parameters</b></td> </tr> | |
| 51 <tr><td>Residual standard error</td><td> 0.8363295 ( 2815 degrees of freedom)</td></tr> | |
| 52 <tr><td>F-statistic</td><td> 39.31142 ( on 1 and 2815 degrees of freedom)</td></tr> | |
| 53 <tr><td>R-squared</td><td> 0.01377265 </td></tr> | |
| 54 <tr><td>Adjusted R-squared</td><td> 0.0134223 </td></tr> | |
| 55 </table> | |
| 56 <font color='#ff0000'><h3>Regression and diagnostics plots</h3></font> | |
| 57 <table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "><tr bgcolor="#7a0019"><th> <font color='#ffcc33'><h4>1) <u>Residuals vs Fitted plot</u></h4></font></th> | |
| 58 <th><font color=#ffcc33><h4>2) <u>Normal Q-Q plot of residuals</u></h4></font></th></tr> | |
| 59 <tr><td align=center><img src="PE_TE_lm_1.png" width=600 height=600></td><td align=center><img src="PE_TE_lm_2.png" width=600 height=600></td></tr> | |
| 60 <tr><td align=center>This plot checks the linear relationship assumption.<br>A roughly horizontal line without any distinct pattern indicates a linear relationship.</td> | |
| 61 <td align=center>This plot checks whether the residuals are normally distributed.<br>The fit is good if the residual points follow the straight dashed line, i.e., do not deviate much from it.</td></tr></table> | |
| 62 <br><h2 id="inf_obs"><font color=#ff0000>Outliers based on the residuals from regression analysis</font></h2> | |
| 63 <table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> | |
| 64 <tr bgcolor="#7a0019"><th colspan=2><font color=#ffcc33>Residuals from Regression</font></th></tr> | |
| 65 <tr bgcolor="#7a0019"><th><font color=#ffcc33>Parameter</font></th><th><font color=#ffcc33>Value</font></th></tr> | |
| 66 <tr><td>Mean Residual value</td><td> 1.942328e-17 </td></tr> | |
| 67 <tr><td>Standard deviation (Residuals)</td><td> 0.836181 </td></tr> | |
| 68 <tr><td>Total outliers (Residual value > 2 standard deviations from the mean)</td><td> 164 <font size=4>(<b><a href=PE_TE_outliers_residuals.txt target="_blank">Download these 164 data points with high residual values here</a></b>)</font></td></tr> | |
| 69 <tr><td colspan=2 align=center><font size=4>(<b><a href=PE_TE_abundance_residuals.txt target="_blank">Download the complete residuals data here</a></b>)</font></td></tr> | |
| 70 </table><br><br> | |
| 71 <br><br><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "><tr bgcolor="#7a0019"><th><font color=#ffcc33><h4>3) <u>Residuals vs Leverage plot</u></h4></font></th></tr> | |
| 72 <tr><td align=center><img src="PE_TE_lm_5.png" width=600 height=600></td></tr> | |
| 73 <tr><td align=center>This plot helps identify influential cases, that is, outliers or extreme values.<br>Including or excluding them from the analysis might change the regression results.</td></tr></table><br> | |
| 74 <hr/><h2 id="inf_obs"><font color=#ff0000>INFLUENTIAL OBSERVATIONS</font></h2> | |
| 75 <p><b>Cook's distance</b> quantifies the influence of each data point/observation on the predicted outcome, i.e., it measures how much the observation influences the fitted values.<br>In general use, observations that have a <b>Cook's distance greater than 4 times the mean</b> may be classified as <b>influential.</b></p> | |
| 76 <img src="PE_TE_lm_cooksd.png" width=800 height=800> <br>In the above plot, observations above the red line (4 * mean Cook's distance) are influential. Genes that are outliers could be important. These observations influence the correlation values and regression coefficients.<br><br><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Parameter</font></th><th><font color=#ffcc33>Value</font></th></tr> | |
| 77 <tr><td>Mean Cook's distance</td><td> 0.0004875011 </td></tr> | |
| 78 <tr><td>Total influential observations (Cook's distance > 4 * mean Cook's distance)</td><td> 115 </td></tr> | |
| 79 <tr><td>Observations with Cook's distance < 4 * mean Cook's distance</td><td> 2702 </td></tr> | |
| 80 </table><br><br> | |
| 81 <table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Scatterplot: Before removal</font></th><th><font color=#ffcc33>Scatterplot: After removal</font></th></tr> | |
| 82 <tr><td align=center><!--<font color='#ff0000'><h3>Scatter plot between Proteome and Transcriptome Abundance</h3></font> | |
| 83 --> <img src="TE_PE_scatter.png" width=600 height=600></td> | |
| 84 <td align=center> | |
| 85 <img src="AbundancePlot_scatter_without_outliers.png" width=600 height=600></td></tr> | |
| 86 <tr><td> | |
| 87 <table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Parameter</font></th><th><font color=#ffcc33>Method 1</font></th><th><font color=#ffcc33>Method 2</font></th><th><font color=#ffcc33>Method 3</font></th></tr> | |
| 88 <tr><td>Correlation method</td><td> Pearson's product-moment correlation </td><td> Spearman's rank correlation rho </td><td> Kendall's rank correlation tau </td></tr> | |
| 89 <tr><td>Correlation coefficient</td><td> 0.1173569 </td><td> 0.1608612 </td><td> 0.1093701 </td></tr> | |
| 90 </table> | |
| 91 </td> | |
| 92 <td><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Parameter</font></th><th><font color=#ffcc33>Method 1</font></th><th><font color=#ffcc33>Method 2</font></th><th><font color=#ffcc33>Method 3</font></th></tr> | |
| 93 <tr><td>Correlation method</td><td> Pearson's product-moment correlation </td><td> Spearman's rank correlation rho </td><td> Kendall's rank correlation tau </td></tr> | |
| 94 <tr><td>Correlation coefficient</td><td> 0.1334038 </td><td> 0.1611936 </td><td> 0.1082761 </td></tr> | |
| 95 </table></td></tr></table> | |
| 96 <br><br><font size=5><b><a href='PE_TE_influential_observation.txt' target='_blank'>Download the complete list of influential observations</a></b></font> <font size=5><b><a href='PE_TE_non_influential_observation.txt' target='_blank'>Download the complete list (After removing influential points)</a></b></font><br> | |
| 97 <br><font color="brown"><h4>Top 10 Influential observations (Cook's distance > 4 * mean Cook's distance)</h4></font> | |
| 98 <table border=1 cellspacing=0 cellpadding=5> <tr bgcolor="#7a0019"> | |
| 99 <th><font color=#ffcc33>Gene</font></th><th><font color=#ffcc33>Protein Log Fold-Change</font></th><th><font color=#ffcc33>Transcript Log Fold-Change</font></th><th><font color=#ffcc33>Cook's Distance</font></th></tr> | |
| 100 <tr> <td> CATHL2 </td> | |
| 101 <td> -1.960863 </td> | |
| 102 <td> 4.88565 </td> | |
| 103 <td> 0.1432189 </td></tr> | |
| 104 <tr> <td> CD177 </td> | |
| 105 <td> -4.173263 </td> | |
| 106 <td> 2.057499 </td> | |
| 107 <td> 0.06826605 </td></tr> | |
| 108 <tr> <td> CATHL1 </td> | |
| 109 <td> -0.9912973 </td> | |
| 110 <td> 4.835209 </td> | |
| 111 <td> 0.05767091 </td></tr> | |
| 112 <tr> <td> HP </td> | |
| 113 <td> 2.570727 </td> | |
| 114 <td> 3.885549 </td> | |
| 115 <td> 0.04680496 </td></tr> | |
| 116 <tr> <td> AZU1 </td> | |
| 117 <td> -2.226356 </td> | |
| 118 <td> -5.561874 </td> | |
| 119 <td> 0.03737565 </td></tr> | |
| 120 <tr> <td> ELANE </td> | |
| 121 <td> -2.732479 </td> | |
| 122 <td> -2.914936 </td> | |
| 123 <td> 0.03266198 </td></tr> | |
| 124 <tr> <td> PYGM </td> | |
| 125 <td> -0.06079228 </td> | |
| 126 <td> 6.071712 </td> | |
| 127 <td> 0.03242859 </td></tr> | |
| 128 <tr> <td> LTF </td> | |
| 129 <td> -2.4294 </td> | |
| 130 <td> 2.129742 </td> | |
| 131 <td> 0.02725017 </td></tr> | |
| 132 <tr> <td> ATP1A2 </td> | |
| 133 <td> 0.2871971 </td> | |
| 134 <td> 6.446299 </td> | |
| 135 <td> 0.01939256 </td></tr> | |
| 136 <tr> <td> C13H20orf194 </td> | |
| 137 <td> -5.640732 </td> | |
| 138 <td> -0.6697401 </td> | |
| 139 <td> 0.01852927 </td></tr> | |
| 140 </table><br><br> | |
| 141 <hr/><h2 id="cluster_data"><font color=#ff0000>CLUSTER ANALYSIS</font></h2> | |
| 142 <br><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>Heatmap of PE and TE abundance values (Hierarchical clustering)</font></th><th><font color=#ffcc33>Number of clusters to extract: 5 </font></th></tr> | |
| 143 <tr><td align=center colspan="2"><img src="PE_TE_heatmap.png" width=800 height=800></td></tr> | |
| 144 <tr><td colspan="2" align=center><font size=5><a href="PE_TE_hc_clusterpoints.txt" target="_blank"><b>Download the hierarchical cluster list</b></a></font></td></tr></table> | |
| 145 <br><br><table border=1 cellspacing=0 cellpadding=5 style="table-layout:auto; "> <tr bgcolor="#7a0019"><th><font color=#ffcc33>K-means clustering</font></th><th><font color=#ffcc33>Number of clusters: 4 </font></th></tr> | |
| 146 <tr><td colspan="2" align=center><img src="PE_TE_kmeans.png" width=800 height=800></td></tr> | |
| 147 <tr><td colspan="2" align=center><font size=5><a href="PE_TE_kmeans_clusterpoints.txt" target="_blank"><b>Download the cluster list</b></a></font></td></tr></table><br><hr/> | |
| 148 <h3>Go To:</h3> | |
| 149 <ul> | |
| 150 <li><a href=#sample_dist>Sample distribution</a></li> | |
| 151 <li><a href=#corr_data>Correlation</a></li> | |
| 152 <li><a href=#regression_data>Regression analysis</a></li> | |
| 153 <li><a href=#inf_obs>Influential observations</a></li> | |
| 154 <li><a href=#cluster_data>Cluster analysis</a></li></ul> | |
| 155 <br><a href=#>TOP</a></body></html> |
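
The report's "Influential observations" section flags points whose Cook's distance exceeds 4 times the mean Cook's distance. The sketch below illustrates that rule for a one-predictor linear model (the report's formula is PE_abundance~TE_abundance). It is not the tool's own code: the function names `cooks_distances` and `influential_indices` are hypothetical, and it uses the standard closed-form leverage for simple linear regression with only the Python standard library.

```python
from statistics import mean


def cooks_distances(x, y):
    """Cook's distance for each point of a simple linear regression y ~ x.

    Hypothetical helper for illustration; assumes at least 3 points,
    non-constant x, and a fit that is not exact (non-zero residual variance).
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # Ordinary least-squares slope and intercept.
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    p = 2  # number of fitted coefficients: intercept and slope
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage (hat value) of each point in simple linear regression.
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        e ** 2 / (p * mse) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


def influential_indices(x, y, multiplier=4.0):
    """Indices whose Cook's distance exceeds multiplier * mean Cook's distance."""
    distances = cooks_distances(x, y)
    cutoff = multiplier * mean(distances)
    return [i for i, d in enumerate(distances) if d > cutoff]


# Toy data (not from the report): a straight line with one distorted point.
te = [float(v) for v in range(10)]
pe = [float(v) for v in range(10)]
pe[9] = 30.0
flagged = influential_indices(te, pe)
```

With real data, `te` and `pe` would hold the transcript and protein log fold-change columns; the flagged indices correspond to the rows the report lists as influential.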

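
The residual table in the report counts a point as an outlier when its residual lies more than 2 standard deviations from the mean residual. A minimal sketch of that rule follows. The function name `residual_outliers` is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption; it appears consistent with the reported values (0.836181 against a residual standard error of 0.8363295 on 2815 degrees of freedom), but the report does not state it.

```python
from statistics import mean, stdev


def residual_outliers(residuals, n_sd=2.0):
    """Indices of residuals more than n_sd sample standard deviations from the mean.

    Hypothetical helper for illustration, not taken from the tool itself.
    """
    centre = mean(residuals)
    spread = stdev(residuals)  # sample standard deviation (n - 1 denominator)
    return [i for i, r in enumerate(residuals) if abs(r - centre) > n_sd * spread]


# Toy residuals (not from the report): nine small values and one large one.
toy_residuals = [0.1, -0.2, 0.05, -0.1, 0.15, -0.05, 0.0, 0.1, -0.1, 5.0]
outlier_rows = residual_outliers(toy_residuals)
```

The same function applied to the values in the complete residuals download would reproduce the kind of count shown in the "Total outliers" row, under the stated assumption about the standard deviation.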