Commit message:
planemo upload commit a2411926bebc2ca3bb31215899a9f18a67e59556 |
added:
LICENSE static/images/ChangeDatatype.pdf static/images/descriptive_stat_all.png static/images/descriptive_stat_boxplot.png static/images/descriptive_stat_density.png static/images/descriptive_stat_histo.png static/images/descriptive_stat_maplot.png static/images/descriptive_stat_pairs.png static/images/input_count_file.png summary_statistics.R summary_statistics.xml summary_statistics_galaxy.R test-data/decathlon.tsv test-data/graph_file test-data/log_file test-data/table_file |
diff -r 000000000000 -r 46ddb0591d8b LICENSE
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/LICENSE Thu Jan 18 07:44:37 2018 -0500
b'@@ -0,0 +1,674 @@\n+ GNU GENERAL PUBLIC LICENSE\n+ Version 3, 29 June 2007\n+\n+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>\n+ Everyone is permitted to copy and distribute verbatim copies\n+ of this license document, but changing it is not allowed.\n+\n+ Preamble\n+\n+ The GNU General Public License is a free, copyleft license for\n+software and other kinds of works.\n+\n+ The licenses for most software and other practical works are designed\n+to take away your freedom to share and change the works. By contrast,\n+the GNU General Public License is intended to guarantee your freedom to\n+share and change all versions of a program--to make sure it remains free\n+software for all its users. We, the Free Software Foundation, use the\n+GNU General Public License for most of our software; it applies also to\n+any other work released this way by its authors. You can apply it to\n+your programs, too.\n+\n+ When we speak of free software, we are referring to freedom, not\n+price. Our General Public Licenses are designed to make sure that you\n+have the freedom to distribute copies of free software (and charge for\n+them if you wish), that you receive source code or can get it if you\n+want it, that you can change the software or use pieces of it in new\n+free programs, and that you know you can do these things.\n+\n+ To protect your rights, we need to prevent others from denying you\n+these rights or asking you to surrender the rights. Therefore, you have\n+certain responsibilities if you distribute copies of the software, or if\n+you modify it: responsibilities to respect the freedom of others.\n+\n+ For example, if you distribute copies of such a program, whether\n+gratis or for a fee, you must pass on to the recipients the same\n+freedoms that you received. You must make sure that they, too, receive\n+or can get the source code. 
And you must show them these terms so they\n+know their rights.\n+\n+ Developers that use the GNU GPL protect your rights with two steps:\n+(1) assert copyright on the software, and (2) offer you this License\n+giving you legal permission to copy, distribute and/or modify it.\n+\n+ For the developers\' and authors\' protection, the GPL clearly explains\n+that there is no warranty for this free software. For both users\' and\n+authors\' sake, the GPL requires that modified versions be marked as\n+changed, so that their problems will not be attributed erroneously to\n+authors of previous versions.\n+\n+ Some devices are designed to deny users access to install or run\n+modified versions of the software inside them, although the manufacturer\n+can do so. This is fundamentally incompatible with the aim of\n+protecting users\' freedom to change the software. The systematic\n+pattern of such abuse occurs in the area of products for individuals to\n+use, which is precisely where it is most unacceptable. Therefore, we\n+have designed this version of the GPL to prohibit the practice for those\n+products. If such problems arise substantially in other domains, we\n+stand ready to extend this provision to those domains in future versions\n+of the GPL, as needed to protect the freedom of users.\n+\n+ Finally, every program is threatened constantly by software patents.\n+States should not allow patents to restrict development and use of\n+software on general-purpose computers, but in those that do, we wish to\n+avoid the special danger that patents applied to a free program could\n+make it effectively proprietary. To prevent this, the GPL assures that\n+patents cannot be used to render the program non-free.\n+\n+ The precise terms and conditions for copying, distribution and\n+modification follow.\n+\n+ TERMS AND CONDITIONS\n+\n+ 0. 
Definitions.\n+\n+ "This License" refers to version 3 of the GNU General Public License.\n+\n+ "Copyright" also means copyright-like laws that apply to other kinds of\n+works, such as semiconductor masks.\n+\n+ "The Program" refers to a'..b'CE OF THE PROGRAM\n+IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF\n+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n+\n+ 16. Limitation of Liability.\n+\n+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING\n+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS\n+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY\n+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE\n+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF\n+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD\n+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),\n+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF\n+SUCH DAMAGES.\n+\n+ 17. Interpretation of Sections 15 and 16.\n+\n+ If the disclaimer of warranty and limitation of liability provided\n+above cannot be given local legal effect according to their terms,\n+reviewing courts shall apply local law that most closely approximates\n+an absolute waiver of all civil liability in connection with the\n+Program, unless a warranty or assumption of liability accompanies a\n+copy of the Program in return for a fee.\n+\n+ END OF TERMS AND CONDITIONS\n+\n+ How to Apply These Terms to Your New Programs\n+\n+ If you develop a new program, and you want it to be of the greatest\n+possible use to the public, the best way to achieve this is to make it\n+free software which everyone can redistribute and change under these terms.\n+\n+ To do so, attach the following notices to the program. 
It is safest\n+to attach them to the start of each source file to most effectively\n+state the exclusion of warranty; and each file should have at least\n+the "copyright" line and a pointer to where the full notice is found.\n+\n+ {one line to give the program\'s name and a brief idea of what it does.}\n+ Copyright (C) {year} {name of author}\n+\n+ This program is free software: you can redistribute it and/or modify\n+ it under the terms of the GNU General Public License as published by\n+ the Free Software Foundation, either version 3 of the License, or\n+ (at your option) any later version.\n+\n+ This program is distributed in the hope that it will be useful,\n+ but WITHOUT ANY WARRANTY; without even the implied warranty of\n+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n+ GNU General Public License for more details.\n+\n+ You should have received a copy of the GNU General Public License\n+ along with this program. If not, see <http://www.gnu.org/licenses/>.\n+\n+Also add information on how to contact you by electronic and paper mail.\n+\n+ If the program does terminal interaction, make it output a short\n+notice like this when it starts in an interactive mode:\n+\n+ {project} Copyright (C) {year} {fullname}\n+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w\'.\n+ This is free software, and you are welcome to redistribute it\n+ under certain conditions; type `show c\' for details.\n+\n+The hypothetical commands `show w\' and `show c\' should show the appropriate\n+parts of the General Public License. 
Of course, your program\'s commands\n+might be different; for a GUI interface, you would use an "about box".\n+\n+ You should also get your employer (if you work as a programmer) or school,\n+if any, to sign a "copyright disclaimer" for the program, if necessary.\n+For more information on this, and how to apply and follow the GNU GPL, see\n+<http://www.gnu.org/licenses/>.\n+\n+ The GNU General Public License does not permit incorporating your program\n+into proprietary programs. If your program is a subroutine library, you\n+may consider it more useful to permit linking proprietary applications with\n+the library. If this is what you want to do, use the GNU Lesser General\n+Public License instead of this License. But first, please read\n+<http://www.gnu.org/philosophy/why-not-lgpl.html>.\n' |
diff -r 000000000000 -r 46ddb0591d8b static/images/ChangeDatatype.pdf
Binary file static/images/ChangeDatatype.pdf has changed
diff -r 000000000000 -r 46ddb0591d8b static/images/descriptive_stat_all.png
Binary file static/images/descriptive_stat_all.png has changed
diff -r 000000000000 -r 46ddb0591d8b static/images/descriptive_stat_boxplot.png
Binary file static/images/descriptive_stat_boxplot.png has changed
diff -r 000000000000 -r 46ddb0591d8b static/images/descriptive_stat_density.png
Binary file static/images/descriptive_stat_density.png has changed
diff -r 000000000000 -r 46ddb0591d8b static/images/descriptive_stat_histo.png
Binary file static/images/descriptive_stat_histo.png has changed
diff -r 000000000000 -r 46ddb0591d8b static/images/descriptive_stat_maplot.png
Binary file static/images/descriptive_stat_maplot.png has changed
diff -r 000000000000 -r 46ddb0591d8b static/images/descriptive_stat_pairs.png
Binary file static/images/descriptive_stat_pairs.png has changed
diff -r 000000000000 -r 46ddb0591d8b static/images/input_count_file.png
Binary file static/images/input_count_file.png has changed
diff -r 000000000000 -r 46ddb0591d8b summary_statistics.R
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/summary_statistics.R Thu Jan 18 07:44:37 2018 -0500
b'@@ -0,0 +1,263 @@\n+###########################################################################\n+# Quality controls and descriptive analysis plots #\n+###########################################################################\n+# Authors: Melanie Petera #\n+###########################################################################\n+# Description : This script allows various displays of data for quality #\n+# control and descriptive analysis. The input data is a matrix of #\n+# quantitative variables, and it returns chosen plots in png format #\n+# and a table with chosen statistics. #\n+###########################################################################\n+# Specific R packages: #\n+# - edgeR (needed for MA plots) #\n+###########################################################################\n+# Version 1 (06-06-2014): display boxplot, histogram, density plot, #\n+# MA plot, pairs plot, and return a table of chosen statistics #\n+# (quantiles, mean, variance, standard error of the mean) #\n+###########################################################################\n+ \n+desc_fct <- function(file.in, nacode, table_file, graph_file, stat, chosen.stat, ploting, chosen.plot, log_file){\n+ # Parameters:\n+ # - file.in: count matrix input (tab-separated) [file name]\n+ # - nacode: missing value coding character\n+ # - table_file: results file containing table of chosen statistics [file name]\n+ # - graph_file: pdf file containing plots for chosen statistics [file name]\n+ # - stat: should statistics be calculated? (TRUE/FALSE)\n+ # - chosen.stat: character listing the chosen statistics (comma-separated)\n+ # - ploting: should graphics be displayed? 
(TRUE/FALSE)\n+ # - chosen.plot: character listing the chosen plots (comma-separated)\n+ # - log_file: a log file [file name]\n+\n+\n+##########################################################\n+# Read and verify data - - - - - - - - - - - - \n+# Checks valids for all modules\n+\n+library(methods)\n+\n+log_error=function(message="") {\n+\tline_use="line"\n+\tcolumn_use="column"\n+\n+\tcat("<HTML><HEAD><TITLE>Normalization report</TITLE></HEAD><BODY>\\n",file=log_file,append=F,sep="")\n+\tcat("⚠ An error occurred while trying to read your table.\\n<BR>",file=log_file,append=T,sep="")\n+\tcat("Please check that:\\n<BR>",file=log_file,append=T,sep="")\n+\tcat("<UL>\\n",file=log_file,append=T,sep="")\n+\tcat(" <LI> the table you want to process contains the same number of columns for each line</LI>\\n",file=log_file,append=T,sep="")\n+\tcat(" <LI> the first line of your table is a header line (specifying the name of each ",column_use,")</LI>\\n",file=log_file,append=T,sep="")\n+\tcat(" <LI> the first column of your table specifies the name of each ",line_use,"</LI>\\n",file=log_file,append=T,sep="")\n+\tcat(" <LI> both individual and variable names should be unique</LI>\\n",file=log_file,append=T,sep="")\n+\tcat(" <LI> each value is separated from the other by a <B>TAB</B> character</LI>\\n",file=log_file,append=T,sep="")\n+\tcat(" <LI> except for first line and first column, table should contain a numeric value</LI>\\n",file=log_file,append=T,sep="")\n+\tcat(" <LI> this value may contain character \'.\' as decimal separator or \'",nacode,"\' for missing values</LI>\\n",file=log_file,append=T,sep="")\n+\tcat("</UL>\\n",file=log_file,append=T,sep="")\n+\tcat("-------<BR>\\nError messages recieved:<BR><FONT 
color=red>\\n",conditionMessage(message),"</FONT>\\n",file=log_file,append=T,sep="")\n+\tcat("</BODY></HTML>\\n",file=log_file,append=T,sep="")\n+\tq(save="no",status=1)\n+}\n+\n+tab_in=tryCatch(\n+\t{\n+\t\ttab_in=read.table(file.in,header=TRUE,na.strings=nacode,sep="\\t",check.names=FALSE,quote="\\"")\n+\t},\n+\terror=function(cond) {\n+\t\tlog_error(message=cond)\n+\t\treturn(NA)\n+\t},\n+\twarning=function(cond) {\n'..b'],2,sd,na.rm=TRUE)\n+ stat.res <- cbind(stat.res,c("Std.Dev",round(colSd,digits=numdig)))\n+ } \n+ \n+ if("variance" %in% stat.list){\n+ colVar <- apply(Dataset[,-1],2,var,na.rm=TRUE)\n+ stat.res <- cbind(stat.res,c("Variance",round(colVar,digits=numdig)))\n+ }\n+ \n+ if(("median" %in% stat.list)&&(!("quartile" %in% stat.list))){\n+ colMed <- apply(Dataset[,-1],2,median,na.rm=TRUE)\n+ stat.res <- cbind(stat.res,c("Median",round(colMed,digits=numdig)))\n+ }\n+ \n+ if("quartile" %in% stat.list){\n+ colQ <- round(apply(Dataset[,-1],2,quantile,na.rm=TRUE),digits=numdig)\n+ stat.res <- cbind(stat.res,c("Min",colQ[1,]),c("Q1",colQ[2,]),\n+ c("Median",colQ[3,]),c("Q3",colQ[4,]),c("Max",colQ[5,]))\n+ }\n+ \n+ if("decile" %in% stat.list){\n+ colD <- round(t(apply(Dataset[,-1],2,quantile,na.rm=TRUE,seq(0,1,0.1))),digits=numdig)\n+ colD <- rbind(paste("D",seq(0,10,1),sep=""),colD)\n+ stat.res <- cbind(stat.res,colD)\n+ }\n+ \n+ write.table(stat.res,table_file,col.names=FALSE,sep="\\t",quote=FALSE)\n+\n+ log=paste(log,"➔ You choose to compute :",chosen.stat,"<BR>")\n+ \n+} # end if(stat)\n+else{\n+ log=paste(log,"➔ You don\'t choose any stats<BR>")\n+}\n+\n+##########################################################\n+# Graphics generation - - - - - - - - - - - - - \n+\n+if(ploting=="T" & length(chosen.plot)!=0){\n+ \n+ nb_graph_per_row=4\n+ nb_graph=ncol(Dataset)-1\n+\n+ nb_row=round(nb_graph/nb_graph_per_row)\n+\n+ nb_empty_plot=nb_graph %% nb_graph_per_row\n+ if (nb_empty_plot != 0) {\n+\tnb_row=nb_row+1\n+ }\n+\n+ page_height=3.5 * nb_row\n+\t\n+ 
pdf(file=graph_file,height=page_height)\n+\n+ graph.list <- strsplit(chosen.plot,",")[[1]]\n+\n+ #For the pair plot, we stick to the default layout\n+ if("pairsplot" %in% graph.list){\n+ pairs(Dataset[,-1])\n+ }\n+\n+ #For the other plots, we have 4 plots per line\n+ par(mfrow=c(nb_row,nb_graph_per_row),mar=c(3, 3, 3, 1) + 0.1)\n+ \n+ if("boxplot" %in% graph.list){\n+ for(ech in 2:ncol(Dataset)){\n+ boxplot(Dataset[,ech],main=colnames(Dataset)[ech],xlab=NULL)\n+ }\n+ #Complete page with empty plots\n+ i=0; while (i<nb_empty_plot) {plot.new();i=i+1;}\n+ }\n+ \n+ if("histogram" %in% graph.list){\n+ for(ech in 2:ncol(Dataset)){\n+ hist(Dataset[,ech],main=colnames(Dataset)[ech],xlab=NULL)\n+ }\n+ #Complete page with empty plots\n+ i=0; while (i<nb_empty_plot) {plot.new();i=i+1;}\n+ }\n+ \n+ if("density" %in% graph.list){\n+ for(ech in 2:ncol(Dataset)){\n+ plot(density(Dataset[,ech],na.rm=TRUE),main=colnames(Dataset)[ech])\n+ }\n+ #Complete page with empty plots\n+ i=0; while (i<nb_empty_plot) {plot.new();i=i+1;}\n+ }\n+ \n+ \n+ if("MAplot" %in% graph.list){\n+ if(min(Dataset[,-1],na.rm=TRUE)<0){\n+\t cat("\\n----\\nError: MAplot only available for positive variables\\n----",file=log_file,append=T,sep="")\n+\t q(save="no",status=1)\n+\t}\n+ library(limma)\n+\n+ library(edgeR) #Warning : Import also limma package\n+ for(ech in 2:(ncol(Dataset)-1)){\n+ for(ech2 in (ech+1):ncol(Dataset)){\n+ temp.pair <- na.omit(Dataset[,c(ech,ech2)])\n+ maPlot(temp.pair[,1],temp.pair[,2],main=paste(colnames(Dataset)[ech],"VS",colnames(Dataset)[ech2]))\n+ }\n+ }\n+ #Do not complete page with empty plots for this plot because it generates nb_variables X nb_variables graphs\n+ }\n+\n+ #Close pdf device\n+ dev.off()\n+ \n+ log=paste(log,"➔ You choose to plot :",chosen.plot,"<BR>")\n+} # end if(ploting)\n+else{\n+ log=paste(log,"➔ You don\'t choose any plot<BR>")\n+}\n+\n+\n+\n+##########################################################\n+# Treatment 
successful\n+##########################################################\n+cat("<HTML><HEAD><TITLE>Summary statistics report</TITLE></HEAD><BODY>\\n",file=log_file,append=F,sep="")\n+cat(log,file=log_file,append=T,sep="")\n+cat("✓ Your process is successfull!<BR>",file=log_file,append=T,sep="")\n+cat("</BODY></HTML>\\n",file=log_file,append=T,sep="")\n+\n+\n+} # end of function\n+\n+\n' |
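The plotting section above sizes the PDF page from `nb_row` and pads the last row with `nb_empty_plot` blank panels. A minimal sketch of that arithmetic follows, assuming the intent is "four plots per row, pad the final row". `plot_layout` and `script_rows` are hypothetical helpers written for this note and are not part of summary_statistics.R. The second helper reproduces the row count as written in the script, where `round()` followed by `+1` appears to allocate one row more than needed for some column counts.

```r
# Hypothetical helper (not part of summary_statistics.R): grid size for
# nb_graph plots laid out nb_graph_per_row per row.
plot_layout <- function(nb_graph, nb_graph_per_row = 4) {
  # ceiling() counts full rows plus a partial last row, if any
  nb_row <- ceiling(nb_graph / nb_graph_per_row)
  # cells left unused in the last row
  nb_empty_plot <- nb_row * nb_graph_per_row - nb_graph
  list(nb_row = nb_row, nb_empty_plot = nb_empty_plot)
}

# Row count as computed in the script, reproduced here for comparison only
script_rows <- function(nb_graph, nb_graph_per_row = 4) {
  nb_row <- round(nb_graph / nb_graph_per_row)
  if (nb_graph %% nb_graph_per_row != 0) nb_row <- nb_row + 1
  nb_row
}

layout_7 <- plot_layout(7)       # 2 rows, 1 empty cell
rows_7_script <- script_rows(7)  # 3 rows for the same 7 plots
```

With the 10 numeric columns of test-data/decathlon.tsv both computations give 3 rows, so the difference only shows for other column counts.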
diff -r 000000000000 -r 46ddb0591d8b summary_statistics.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/summary_statistics.xml Thu Jan 18 07:44:37 2018 -0500
b'@@ -0,0 +1,280 @@\n+<!--# Copyright (C) 2017 INRA\n+# This program is free software: you can redistribute it and/or modify\n+# it under the terms of the GNU General Public License as published by\n+# the Free Software Foundation, either version 3 of the License, or\n+# (at your option) any later version.\n+#\n+# This program is distributed in the hope that it will be useful,\n+# but WITHOUT ANY WARRANTY; without even the implied warranty of\n+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n+# GNU General Public License for more details.\n+# \n+# You should have received a copy of the GNU General Public License\n+# along with this program. If not, see http://www.gnu.org/licenses/.\n+#-->\n+\n+<tool id="summary_statistics" name="Summary statistics" version="1.0.0">\n+ <description>Produce simple descriptive statistics from a numerical table</description>\n+ <requirements>\n+ <requirement type="package">R</requirement>\n+ <requirement type="package">bioconductor-edger</requirement>\n+ <requirement type="package">bioconductor-limma</requirement>\n+ <requirement type="package">r-batch</requirement>\n+ <requirement type="package">r-locfit</requirement>\n+ </requirements>\n+ <stdio>\n+ <!-- Anything other than zero is an error -->\n+ <exit_code range="1:" level="fatal" />\n+ <exit_code range=":-1" level="fatal" />\n+ </stdio>\n+ <command interpreter="Rscript"><![CDATA[\n+ summary_statistics_galaxy.R\n+ file_in \'${file_in}\'\n+ NA_code \'${NA_code}\'\n+ stat \'${stat_cond.stat}\'\n+ #if $stat_cond.stat =="T":\n+ stat_chosen \'${stat_cond.stat_chosen}\'\n+ #end if\n+ ploting \'${ploting_cond.ploting}\'\n+ #if $ploting_cond.ploting =="T":\n+ plot_chosen \'${ploting_cond.plot_chosen}\'\n+ #end if\n+ table_file \'${table_file}\'\n+ graph_file \'${graph_file}\'\n+ log_file \'${log_file}\'\n+ ]]></command>\n+ <inputs>\n+ <param format="csv,tabular" name="file_in" type="data" label="Input File" />\n+ <param name="NA_code" size="30" type="text" value="NA" 
label="Label used for Missing values" />\n+ <conditional name="stat_cond">\n+ <param name="stat" type="select" help="Do you want to compute some basic statistics?" label="Statistics table">\n+ <option value="T">Yes</option>\n+ <option value="F" selected="true">No</option>\n+ </param>\n+ <when value="T">\n+ <param name="stat_chosen" type="select" display="checkboxes" multiple="True" label="Chosen statistic(s)">\n+ <option value="mean">mean</option>\n+ <option value="sd">sd</option>\n+ <option value="variance">variance</option>\n+ <option value="median">median</option>\n+ <option value="quartile">quartile</option>\n+ <option value="decile">decile</option>\n+ <validator type="empty_field" message="Please choose at least one statistic representation" />\n+ </param>\n+ </when>\n+ </conditional>\n+ <conditional name="ploting_cond">\n+ <param name="ploting" type="select" help="Do you want some standard plots?" label="Plots">\n+ <option value="T">Yes</option>\n+ <option value="F" selected="true">No</option>\n+ </param>\n+ <when value="T">\n+ <param name="plot_chosen" type="select" help="" display="checkboxes" multiple="True" label="Chosen plot(s)">\n+ <option value="boxplot">boxplot</option>\n+ <option value="histogram">histogram</option>\n+ <option value="density">density</option>\n+ <option value="pairsplot">pairsplot</option>\n+ <option value="MAplot">MAplot</option>\n+ <validator type="empty_field" message="Please choose at least one p'..b'ers\n+----------\n+\n+Label used for Missing values\n+ | Missing value coding characters\n+ |\n+\n+statistics table\n+ | if YES, allow you to choose statistic(s) you want in your report\n+ |\n+\n+Chosen statistic(s)\n+ | select the statistics you want in your report (see above "Available statistics and plots")\n+ |\n+\n+Plots\n+ | if YES, allow you to choose plot(s) you want in your report\n+ |\n+Chosen plot(s)\n+ | select the plots you want in your report (see above "Available statistics and plots")\n+ 
|\n+\n+------------------------------\n+Available statistics and plots\n+------------------------------\n+\n+**Numerical statistical measures provided are the following ones:**\n+\n+\n+.. image:: https://raw.githubusercontent.com/IFB-ElixirFr/GFLS/master/summary_statistics/static/images/descriptive_stat_all.png\n+ :width: 500\n+\n+\n+**Available plots:**\n+\n+\n+\t* boxplot\n+\n+\n+.. image:: https://raw.githubusercontent.com/IFB-ElixirFr/GFLS/master/summary_statistics/static/images/descriptive_stat_boxplot.png\n+\n+(source : SAS documentation)\n+\n+\t* histogram\n+\n+\n+.. image:: https://raw.githubusercontent.com/IFB-ElixirFr/GFLS/master/summary_statistics/static/images/descriptive_stat_histo.png\n+ :width: 285\n+\n+In this example, about 45 values of the dataset are greater than 0 and lower than 0.5\n+\n+\t* density\n+\n+\n+.. image:: https://raw.githubusercontent.com/IFB-ElixirFr/GFLS/master/summary_statistics/static/images/descriptive_stat_density.png\n+ :width: 275\n+\n+This option computes kernel density estimates (gaussian smoothing).\n+While a histogram displays the observed distribution of a numerical variable, a density plot allows to view the estimated distribution of the theoretical continuous variable. \n+\n+\t* pairsplot\n+\n+\n+.. image:: https://raw.githubusercontent.com/IFB-ElixirFr/GFLS/master/summary_statistics/static/images/descriptive_stat_pairs.png\n+ :width: 475\n+\n+In this example, we have represented a pairs plot for a table with three columns, named \n+"a", "b" and "c". Each plot represents the values of a given column scaled on the x axis versus\n+the values of another column scaled on the y axis.\n+\n+\t* MAplot\n+\n+\n+.. image:: https://raw.githubusercontent.com/IFB-ElixirFr/GFLS/master/summary_statistics/static/images/descriptive_stat_maplot.png\n+ :width: 275\n+\n+Designed for genomic data (count data - only positive values accepted).\n+Each plot allows to compare two samples. 
\n+The ordinate axis (M) represents the log ratios (binary logarithm) whereas the abscissa axis (A) corresponds to the means of log values. \n+\n+\n+\n+------------\n+Output files\n+------------\n+\n+Summary_statistics_report.tsv\n+\t| contains a table with all the requested statistics\n+ |\n+ \n+Summary_statistics_report.pdf\n+\t| contains all the requested graphics\n+ | \n+\n+Summary_statistics_log\n+ |\n+ |\n+ \n+------\n+\n+**Authors** Luc Jouneau (luc.jouneau@inra.fr), M\xc3\xa9lanie P\xc3\xa9t\xc3\xa9ra (melanie.petera@inra.fr), Sarah Maman (sarah.maman@inra.fr) and Valentin Marcon (valentin.marcon@inra.fr)\n+\n+Contact : support.sigenae@inra.fr\n+\n+E-learning available : Not yet.\n+\n+- Information :\n+\n+Tool coded in the R language:\n+\t| *R Core Team. R: A language and environment for statistical computing. R*\n+\t| *Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.*\n+\n+.. class:: infomark\n+\n+-------------\n+Please cite :\n+-------------\n+\n+- (Depending on the help provided you can cite us in acknowledgements, references or both.)\n+ \n+Acknowledgements\n+ | We wish to thank SIGENAE group and the statistical CATI BIOS4Biol group : M\xc3\xa9lanie P\xc3\xa9t\xc3\xa9ra, Sarah Maman, Luc Jouneau\n+ | Re-packaging was provided by Valentin Marcon (INRA, Migale platform http://migale.jouy.inra.fr), as part of the IFB project \'Galaxy For Life Science\' (http://www.france-bioinformatique.fr/fr)\n+ | \n+ \n+References\n+ | SIGENAE [http://www.sigenae.org/]\n+\n+ ]]></help>\n+</tool>\n' |
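The help text above describes the MA plot axes: M is the binary log ratio of two samples and A is the mean of their log values. The sketch below computes both in base R. `ma_values` is a hypothetical helper, not the edgeR implementation the tool calls. The sign convention (second sample minus first) and the strict positivity check are assumptions made for this example, and edgeR's `maPlot` treats very low values in its own way.

```r
# Hypothetical helper, base R only; not edgeR::maPlot.
# Assumption: x and y are strictly positive vectors of equal length.
ma_values <- function(x, y) {
  stopifnot(length(x) == length(y), all(x > 0), all(y > 0))
  lx <- log2(x)
  ly <- log2(y)
  data.frame(
    A = (lx + ly) / 2,  # abscissa: mean of the log2 values
    M = ly - lx         # ordinate: log2 ratio (sign convention assumed)
  )
}

ma <- ma_values(c(2, 8), c(8, 8))
# first pair:  A = (1 + 3) / 2 = 2, M = 3 - 1 = 2
# second pair: A = (3 + 3) / 2 = 3, M = 3 - 3 = 0
```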
diff -r 000000000000 -r 46ddb0591d8b summary_statistics_galaxy.R
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/summary_statistics_galaxy.R Thu Jan 18 07:44:37 2018 -0500
@@ -0,0 +1,52 @@
+###########################################################################
+# Quality controls and descriptive analysis plots #
+###########################################################################
+# Authors: Melanie Petera #
+###########################################################################
+# Description : This script allows various displays of data for quality #
+# control and descriptive analysis. The input data is a matrix of #
+# quantitative variables, and it returns chosen plots in pdf format #
+# and a table with chosen statistics. #
+###########################################################################
+# Specific R packages: #
+# - edgeR (needed for MA plots) #
+###########################################################################
+# Version 1 (06-06-2014): display boxplot, histogram, density plot, #
+# MA plot, pairs plot, and return a table of chosen statistics #
+# (quantiles, mean, variance, standard deviation) #
+###########################################################################
+
+
+##------------------------------
+## Libraries loading
+##------------------------------
+# For parseCommandArgs function
+library(batch)
+
+# R script call
+source_local <- function(fname)
+{
+    argv <- commandArgs(trailingOnly = FALSE)
+    base_dir <- dirname(substring(argv[grep("--file=", argv)], 8))
+    source(paste(base_dir, fname, sep="/"))
+}
+
+#Import the different functions used for Summary_Statistics
+source_local("summary_statistics.R")
+
+##------------------------------
+## Parameter reading
+##------------------------------
+argLs <- parseCommandArgs(evaluate=FALSE)
+
+desc_fct(file.in=argLs[["file_in"]],
+nacode=argLs[["NA_code"]],
+table_file=argLs[["table_file"]],
+graph_file=argLs[["graph_file"]],
+stat=argLs[["stat"]],
+chosen.stat=argLs[["stat_chosen"]],
+ploting=argLs[["ploting"]],
+chosen.plot=argLs[["plot_chosen"]],
+log_file=argLs[["log_file"]])
+
+
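The wrapper relies on `parseCommandArgs(evaluate=FALSE)` from the batch package to turn the name/value pairs on the Galaxy command line into a named list. The sketch below is a base-R stand-in for that step so the lookup behaviour can be tried without installing batch. `parse_pairs` is a hypothetical function written for this note and makes no claim to match the internals of batch.

```r
# Hypothetical stand-in for the name/value parsing step; not batch's code.
# Assumption: arguments arrive as alternating name, value strings.
parse_pairs <- function(args) {
  if (length(args) == 0) return(list())
  stopifnot(length(args) %% 2 == 0)
  keys <- args[seq(1, length(args), by = 2)]
  values <- args[seq(2, length(args), by = 2)]
  # Values stay character strings, as with evaluate=FALSE
  as.list(setNames(values, keys))
}

argLs <- parse_pairs(c("file_in", "decathlon.tsv",
                       "NA_code", "NA",
                       "stat", "F"))

# stat_chosen is only passed when stat is "T" (see the #if block in the
# tool XML), so the lookup yields NULL here.
missing_choice <- is.null(argLs[["stat_chosen"]])
```

Since absent names come back as `NULL`, `desc_fct` receives `chosen.stat=NULL` whenever the statistics option is switched off.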
diff -r 000000000000 -r 46ddb0591d8b test-data/decathlon.tsv
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/decathlon.tsv Thu Jan 18 07:44:37 2018 -0500
@@ -0,0 +1,42 @@
+"name" "100m" "Long.jump" "Shot.put" "High.jump" "400m" "110m.hurdle" "Discus" "Pole.vault" "Javeline" "1500m"
+"SEBRLE" 11.04 7.58 14.83 2.07 49.81 14.69 43.75 5.02 63.19 291.7
+"CLAY" 10.76 7.4 14.26 1.86 49.37 14.05 50.72 4.92 60.15 301.5
+"KARPOV" 11.02 7.3 14.77 2.04 48.37 14.09 48.95 4.92 50.31 300.2
+"BERNARD" 11.02 7.23 14.25 1.92 48.93 14.99 40.87 5.32 62.77 280.1
+"YURKOV" 11.34 7.09 15.19 2.1 50.42 15.31 46.26 4.72 63.44 276.4
+"WARNERS" 11.11 7.6 14.31 1.98 48.68 14.23 41.1 4.92 51.77 278.1
+"ZSIVOCZKY" 11.13 7.3 13.48 2.01 48.62 14.17 45.67 4.42 55.37 268
+"McMULLEN" 10.83 7.31 13.76 2.13 49.91 14.38 44.41 4.42 56.37 285.1
+"MARTINEAU" 11.64 6.81 14.57 1.95 50.14 14.93 47.6 4.92 52.33 262.1
+"HERNU" 11.37 7.56 14.41 1.86 51.1 15.06 44.99 4.82 57.19 285.1
+"BARRAS" 11.33 6.97 14.09 1.95 49.48 14.48 42.1 4.72 55.4 282
+"NOOL" 11.33 7.27 12.68 1.98 49.2 15.29 37.92 4.62 57.44 266.6
+"BOURGUIGNON" 11.36 6.8 13.46 1.86 51.16 15.67 40.49 5.02 54.68 291.7
+"Sebrle" 10.85 7.84 16.36 2.12 48.36 14.05 48.72 5 70.52 280.01
+"Clay" 10.44 7.96 15.23 2.06 49.19 14.13 50.11 4.9 69.71 282
+"Karpov" 10.5 7.81 15.93 2.09 46.81 13.97 51.65 4.6 55.54 278.11
+"Macey" 10.89 7.47 15.73 2.15 48.97 14.56 48.34 4.4 58.46 265.42
+"Warners" 10.62 7.74 14.48 1.97 47.97 14.01 43.73 4.9 55.39 278.05
+"Zsivoczky" 10.91 7.14 15.31 2.12 49.4 14.95 45.62 4.7 63.45 269.54
+"Hernu" 10.97 7.19 14.65 2.03 48.73 14.25 44.72 4.8 57.76 264.35
+"Nool" 10.8 7.53 14.26 1.88 48.81 14.8 42.05 5.4 61.33 276.33
+"Bernard" 10.69 7.48 14.8 2.12 49.13 14.17 44.75 4.4 55.27 276.31
+"Schwarzl" 10.98 7.49 14.01 1.94 49.76 14.25 42.43 5.1 56.32 273.56
+"Pogorelov" 10.95 7.31 15.1 2.06 50.79 14.21 44.6 5 53.45 287.63
+"Schoenbeck" 10.9 7.3 14.77 1.88 50.3 14.34 44.41 5 60.89 278.82
+"Barras" 11.14 6.99 14.91 1.94 49.41 14.37 44.83 4.6 64.55 267.09
+"Smith" 10.85 6.81 15.24 1.91 49.27 14.01 49.02 4.2 61.52 272.74
+"Averyanov" 10.55 7.34 14.44 1.94 49.72 14.39 39.88 4.8 54.51 271.02
+"Ojaniemi" 10.68 7.5 14.97 1.94 49.12 15.01 40.35 4.6 59.26 275.71
+"Smirnov" 10.89 7.07 13.88 1.94 49.11 14.77 42.47 4.7 60.88 263.31
+"Qi" 11.06 7.34 13.55 1.97 49.65 14.78 45.13 4.5 60.79 272.63
+"Drews" 10.87 7.38 13.07 1.88 48.51 14.01 40.11 5 51.53 274.21
+"Parkhomenko" 11.14 6.61 15.69 2.03 51.04 14.88 41.9 4.8 65.82 277.94
+"Terek" 10.92 6.94 15.15 1.94 49.56 15.12 45.62 5.3 50.62 290.36
+"Gomez" 11.08 7.26 14.57 1.85 48.61 14.41 40.95 4.4 60.71 269.7
+"Turi" 11.08 6.91 13.62 2.03 51.67 14.26 39.83 4.8 59.34 290.01
+"Lorenzo" 11.1 7.03 13.22 1.85 49.34 15.38 40.22 4.5 58.36 263.08
+"Karlivans" 11.33 7.26 13.3 1.97 50.54 14.98 43.34 4.5 52.92 278.67
+"Korkizoglou" 10.86 7.07 14.81 1.94 51.16 14.96 46.07 4.7 53.05 317
+"Uldal" 11.23 6.99 13.53 1.85 50.95 15.09 43.01 4.5 60 281.7
+"Casarsa" 11.36 6.68 14.92 1.94 53.2 15.39 48.66 4.4 58.62 296.12
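The test table above is read by summary_statistics.R with `read.table(..., header=TRUE, sep="\t", check.names=FALSE, quote="\"")`. The sketch below applies the same options to a three-column excerpt of the data, fed in through the `text` argument instead of a file. The excerpt and the variable name `tsv` are choices made for this example only. It shows why `check.names=FALSE` matters for this file: column names such as `100m` start with a digit and would otherwise be rewritten.

```r
# Three-column excerpt of decathlon.tsv, tab-separated (illustration only)
tsv <- paste(
  "\"name\"\t\"100m\"\t\"Long.jump\"",
  "\"SEBRLE\"\t11.04\t7.58",
  "\"CLAY\"\t10.76\t7.4",
  sep = "\n"
)

# Same read options as the script; "NA" stands in for the nacode parameter
tab_in <- read.table(text = tsv, header = TRUE, na.strings = "NA",
                     sep = "\t", check.names = FALSE, quote = "\"")

# Column names are kept verbatim because check.names = FALSE
kept_names <- colnames(tab_in)
```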
diff -r 000000000000 -r 46ddb0591d8b test-data/graph_file
Binary file test-data/graph_file has changed
diff -r 000000000000 -r 46ddb0591d8b test-data/log_file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/log_file Thu Jan 18 07:44:37 2018 -0500
@@ -0,0 +1,2 @@
+<HTML><HEAD><TITLE>Summary statistics report</TITLE></HEAD><BODY>
+ ➔ You choose to compute : mean,sd,variance,median,quartile,decile <BR> ➔ You choose to plot : boxplot,histogram,density,pairsplot,MAplot <BR>✓ Your process is successfull!<BR></BODY></HTML>
diff -r 000000000000 -r 46ddb0591d8b test-data/table_file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/table_file Thu Jan 18 07:44:37 2018 -0500
@@ -0,0 +1,11 @@
+name Mean Std.Dev Variance Min Q1 Median Q3 Max D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10
+100m 10.99805 0.26302 0.06918 10.44 10.85 10.98 11.14 11.64 10.44 10.68 10.83 10.87 10.91 10.98 11.06 11.11 11.23 11.34 11.64
+Long.jump 7.26 0.3164 0.10011 6.61 7.03 7.3 7.48 7.96 6.61 6.81 6.99 7.07 7.23 7.3 7.31 7.4 7.5 7.6 7.96
+Shot.put 14.47707 0.82443 0.67968 12.68 13.88 14.57 14.97 16.36 12.68 13.46 13.62 14.09 14.31 14.57 14.77 14.91 15.15 15.31 16.36
+High.jump 1.97683 0.08895 0.00791 1.85 1.92 1.95 2.04 2.15 1.85 1.86 1.88 1.94 1.94 1.95 1.98 2.03 2.06 2.12 2.15
+400m 49.61634 1.15345 1.33045 46.81 48.93 49.4 50.3 53.2 46.81 48.51 48.73 49.11 49.2 49.4 49.65 49.91 50.54 51.1 53.2
+110m.hurdle 14.60585 0.47179 0.22258 13.97 14.21 14.48 14.98 15.67 13.97 14.05 14.17 14.25 14.37 14.48 14.78 14.95 15.01 15.29 15.67
+Discus 44.32561 3.37784 11.40984 37.92 41.9 44.41 46.07 51.65 37.92 40.22 40.95 42.1 43.34 44.41 44.83 45.62 47.6 48.95 51.65
+Pole.vault 4.76244 0.278 0.07728 4.2 4.5 4.8 4.92 5.4 4.2 4.4 4.5 4.6 4.7 4.8 4.82 4.92 5 5.02 5.4
+Javeline 58.31659 4.82682 23.29819 50.31 55.27 58.36 60.89 70.52 50.31 52.33 54.51 55.39 56.37 58.36 59.34 60.79 61.52 63.45 70.52
+1500m 279.02488 11.67325 136.2647 262.1 271.02 278.05 285.1 317 262.1 265.42 269.54 272.74 276.31 278.05 278.82 282 287.63 291.7 317
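The Min/Q1/Median/Q3/Max and D0 to D10 columns of the table above come from `quantile()` applied column by column in summary_statistics.R. The sketch below repeats those two calls on a small toy data frame whose quantiles are easy to check by hand. The data frame `toy` is invented for this example, and `numdig <- 5` is an assumption: the script defines `numdig` in a part of the diff that is elided here, and five digits only matches what the test output suggests.

```r
# Toy input mirroring the script's layout: first column holds row names,
# the remaining columns are numeric (values invented for illustration).
toy <- data.frame(name = paste0("s", 0:10),
                  v = 0:10,
                  w = seq(0, 100, by = 10))
numdig <- 5  # assumption, see the note above

# Quartiles: one column per variable, rows 0%, 25%, 50%, 75%, 100%
colQ <- round(apply(toy[, -1], 2, quantile, na.rm = TRUE), digits = numdig)

# Deciles: transposed so that each variable is a row, columns 0% to 100%
colD <- round(t(apply(toy[, -1], 2, quantile, na.rm = TRUE, seq(0, 1, 0.1))),
              digits = numdig)
```

`quantile()` uses its default interpolation (type 7) in both calls, so on other data the results fall between observed values whenever the requested position is not an integer rank.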