Commit message:
planemo upload commit a2411926bebc2ca3bb31215899a9f18a67e59556 |
added:
LICENSE normalization.R normalization.xml normalization_galaxy.R test-data/decathlon.tsv test-data/log_file test-data/output_file |
diff -r 000000000000 -r 79f00bc83ecc LICENSE --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/LICENSE Thu Jan 18 06:20:30 2018 -0500 |
b'@@ -0,0 +1,674 @@\n+ GNU GENERAL PUBLIC LICENSE\n+ Version 3, 29 June 2007\n+\n+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>\n+ Everyone is permitted to copy and distribute verbatim copies\n+ of this license document, but changing it is not allowed.\n+\n+ Preamble\n+\n+ The GNU General Public License is a free, copyleft license for\n+software and other kinds of works.\n+\n+ The licenses for most software and other practical works are designed\n+to take away your freedom to share and change the works. By contrast,\n+the GNU General Public License is intended to guarantee your freedom to\n+share and change all versions of a program--to make sure it remains free\n+software for all its users. We, the Free Software Foundation, use the\n+GNU General Public License for most of our software; it applies also to\n+any other work released this way by its authors. You can apply it to\n+your programs, too.\n+\n+ When we speak of free software, we are referring to freedom, not\n+price. Our General Public Licenses are designed to make sure that you\n+have the freedom to distribute copies of free software (and charge for\n+them if you wish), that you receive source code or can get it if you\n+want it, that you can change the software or use pieces of it in new\n+free programs, and that you know you can do these things.\n+\n+ To protect your rights, we need to prevent others from denying you\n+these rights or asking you to surrender the rights. Therefore, you have\n+certain responsibilities if you distribute copies of the software, or if\n+you modify it: responsibilities to respect the freedom of others.\n+\n+ For example, if you distribute copies of such a program, whether\n+gratis or for a fee, you must pass on to the recipients the same\n+freedoms that you received. You must make sure that they, too, receive\n+or can get the source code. 
And you must show them these terms so they\n+know their rights.\n+\n+ Developers that use the GNU GPL protect your rights with two steps:\n+(1) assert copyright on the software, and (2) offer you this License\n+giving you legal permission to copy, distribute and/or modify it.\n+\n+ For the developers\' and authors\' protection, the GPL clearly explains\n+that there is no warranty for this free software. For both users\' and\n+authors\' sake, the GPL requires that modified versions be marked as\n+changed, so that their problems will not be attributed erroneously to\n+authors of previous versions.\n+\n+ Some devices are designed to deny users access to install or run\n+modified versions of the software inside them, although the manufacturer\n+can do so. This is fundamentally incompatible with the aim of\n+protecting users\' freedom to change the software. The systematic\n+pattern of such abuse occurs in the area of products for individuals to\n+use, which is precisely where it is most unacceptable. Therefore, we\n+have designed this version of the GPL to prohibit the practice for those\n+products. If such problems arise substantially in other domains, we\n+stand ready to extend this provision to those domains in future versions\n+of the GPL, as needed to protect the freedom of users.\n+\n+ Finally, every program is threatened constantly by software patents.\n+States should not allow patents to restrict development and use of\n+software on general-purpose computers, but in those that do, we wish to\n+avoid the special danger that patents applied to a free program could\n+make it effectively proprietary. To prevent this, the GPL assures that\n+patents cannot be used to render the program non-free.\n+\n+ The precise terms and conditions for copying, distribution and\n+modification follow.\n+\n+ TERMS AND CONDITIONS\n+\n+ 0. 
Definitions.\n+\n+ "This License" refers to version 3 of the GNU General Public License.\n+\n+ "Copyright" also means copyright-like laws that apply to other kinds of\n+works, such as semiconductor masks.\n+\n+ "The Program" refers to a'..b'CE OF THE PROGRAM\n+IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF\n+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n+\n+ 16. Limitation of Liability.\n+\n+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING\n+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS\n+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY\n+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE\n+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF\n+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD\n+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),\n+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF\n+SUCH DAMAGES.\n+\n+ 17. Interpretation of Sections 15 and 16.\n+\n+ If the disclaimer of warranty and limitation of liability provided\n+above cannot be given local legal effect according to their terms,\n+reviewing courts shall apply local law that most closely approximates\n+an absolute waiver of all civil liability in connection with the\n+Program, unless a warranty or assumption of liability accompanies a\n+copy of the Program in return for a fee.\n+\n+ END OF TERMS AND CONDITIONS\n+\n+ How to Apply These Terms to Your New Programs\n+\n+ If you develop a new program, and you want it to be of the greatest\n+possible use to the public, the best way to achieve this is to make it\n+free software which everyone can redistribute and change under these terms.\n+\n+ To do so, attach the following notices to the program. 
It is safest\n+to attach them to the start of each source file to most effectively\n+state the exclusion of warranty; and each file should have at least\n+the "copyright" line and a pointer to where the full notice is found.\n+\n+ {one line to give the program\'s name and a brief idea of what it does.}\n+ Copyright (C) {year} {name of author}\n+\n+ This program is free software: you can redistribute it and/or modify\n+ it under the terms of the GNU General Public License as published by\n+ the Free Software Foundation, either version 3 of the License, or\n+ (at your option) any later version.\n+\n+ This program is distributed in the hope that it will be useful,\n+ but WITHOUT ANY WARRANTY; without even the implied warranty of\n+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n+ GNU General Public License for more details.\n+\n+ You should have received a copy of the GNU General Public License\n+ along with this program. If not, see <http://www.gnu.org/licenses/>.\n+\n+Also add information on how to contact you by electronic and paper mail.\n+\n+ If the program does terminal interaction, make it output a short\n+notice like this when it starts in an interactive mode:\n+\n+ {project} Copyright (C) {year} {fullname}\n+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w\'.\n+ This is free software, and you are welcome to redistribute it\n+ under certain conditions; type `show c\' for details.\n+\n+The hypothetical commands `show w\' and `show c\' should show the appropriate\n+parts of the General Public License. 
Of course, your program\'s commands\n+might be different; for a GUI interface, you would use an "about box".\n+\n+ You should also get your employer (if you work as a programmer) or school,\n+if any, to sign a "copyright disclaimer" for the program, if necessary.\n+For more information on this, and how to apply and follow the GNU GPL, see\n+<http://www.gnu.org/licenses/>.\n+\n+ The GNU General Public License does not permit incorporating your program\n+into proprietary programs. If your program is a subroutine library, you\n+may consider it more useful to permit linking proprietary applications with\n+the library. If this is what you want to do, use the GNU Lesser General\n+Public License instead of this License. But first, please read\n+<http://www.gnu.org/philosophy/why-not-lgpl.html>.\n' |
diff -r 000000000000 -r 79f00bc83ecc normalization.R --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/normalization.R Thu Jan 18 06:20:30 2018 -0500 |
b'@@ -0,0 +1,282 @@\n+# R Script implementing different kind of normalisation \r\n+# Input : a file containing a table with numeric values\r\n+#\t except for the first column containing sample names\r\n+#\t and the first line containing variable names\r\n+#\t separator expected is <TAB>\r\n+#\r\n+# Normalization method :\r\n+#\t log, DESeq2, Rlog, Standard_score, Pareto, TSS, TSS+CLR, Pareto\r\n+#\r\n+# Ouptut : input table with values normalized according\r\n+#\t to the normalization procedure chosen\r\n+#-----------------------------------------------------------------\r\n+# Authors : luc.jouneau(at)inra.fr\r\n+#\t valentin.marcon(at)inra.fr\r\n+# Version : 0.9\r\n+# Date : 30/08/2017\r\n+#-----------------------------------------------------------------\r\n+\r\n+normalization=function(\r\n+##########################################################\r\n+# Function input\r\n+##########################################################\r\n+#Possible values : "log", "DESeq2", "Rlog", "Standard_score", "Pareto", "TSS", "TSS_CLR"\r\n+transformation_method="Standard_score",\r\n+na_encoding="NA",\r\n+#Path to file containg table of values (separator="tab")\r\n+input_file="",\r\n+#Path to file produced after transformation\r\n+output_file="out/table_out.txt",\r\n+#Path to file containing messages for user if something bad happens\r\n+log_file="log/normalization_report.html",\r\n+#Boolean flag (0/1) indicating if variables are in line or in columns\r\n+variable_in_line="1") {\r\n+\r\n+##########################################################\r\n+# Read and verify data\r\n+##########################################################\r\n+#1\xb0) Checks valids for all modules\r\n+if (variable_in_line=="1") {\r\n+\tcolumn_use="individual"\r\n+\tline_use="variable"\r\n+} else {\r\n+\tline_use="individual"\r\n+\tcolumn_use="variable"\r\n+}\r\n+log_error=function(message="") {\r\n+\t\tcat("<HTML><HEAD><TITLE>Normalization 
report</TITLE></HEAD><BODY>\\n",file=log_file,append=F,sep="")\r\n+\t\tcat("⚠ An error occurred while trying to read your table.\\n<BR>",file=log_file,append=T,sep="")\r\n+\t\tcat("Please check that:\\n<BR>",file=log_file,append=T,sep="")\r\n+\t\tcat("<UL>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat(" <LI> the table you want to process contains the same number of columns for each line</LI>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat(" <LI> the first line of your table is a header line (specifying the name of each ",column_use,")</LI>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat(" <LI> the first column of your table specifies the name of each ",line_use,"</LI>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat(" <LI> both individual and variable names should be unique</LI>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat(" <LI> each value is separated from the other by a <B>TAB</B> character</LI>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat(" <LI> except for first line and first column, table should contain a numeric value</LI>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat(" <LI> this value may contain character \'.\' as decimal separator or \'",na_encoding,"\' for missing values</LI>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat("</UL>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat("-------<BR>\\nError messages recieved :<BR><FONT color=red>\\n",conditionMessage(message),"</FONT>\\n",file=log_file,append=T,sep="")\r\n+\t\tcat("</BODY></HTML>\\n",file=log_file,append=T,sep="")\r\n+\t\tq(save="no",status=1)\r\n+}\r\n+\r\n+tab_in=tryCatch(\r\n+\t{\r\n+\t\ttab_in=read.table(file=input_file,sep="\\t",header=T,quote="\\"",na.strings=na_encoding,check.names=FALSE)\r\n+\t},\r\n+\terror=function(cond) {\r\n+\t\tlog_error(message=cond)\r\n+\t\treturn(NA)\r\n+\t},\r\n+\twarning=function(cond) {\r\n+\t\tlog_error(message=cond)\r\n+\t\treturn(NA)\r\n+\t},\r\n+\tfinally={\r\n+\t\t#Do nothing special\r\n+\t}\r\n+)\r\n+\r\n+if (ncol(tab_in)<2) 
{\r\n+\tlog_error(simpleCondition("The table you want to normalize contains less than two columns."))\r\n+}\r\n+\r\n+rn=as.character(tab_in[,1])\r\n+if (length(rn)!=length(unique(rn))) {\r\n+\tduplicated_rownames=table(rn)\r\n+\tduplicated_rownames=duplicated_rownames[duplicated_rownames>1]\r\n+\tduplicated_rownames=names(duplicated_ro'..b'hecks\r\n+##########################################################\r\n+\r\n+### Transpose if variable are in line ###\r\n+if (variable_in_line=="1") {\r\n+\t#Transpose matrix\r\n+\ttab=t(tab)\r\n+}\r\n+\r\n+##########################################################\r\n+### Value transformation\r\n+##########################################################\r\n+\r\n+#Avoid null values when there is a log transformation\r\n+na.replaced=c()\r\n+log.transformed=FALSE\r\n+if (transformation_method %in% c("log","TSS_CLR")) {\r\n+\tlog.transformed=TRUE\r\n+\tfor (idx_col in 1:ncol(tab)) {\r\n+\t\tsel=tab[,idx_col]==0\r\n+\t\tna.replaced=cbind(na.replaced,sel)\r\n+\t\ttab[sel,idx_col]=1e-2\r\n+\t}\r\n+}\r\n+\r\n+### log ###\r\n+if (transformation_method=="log") {\r\n+\ttab=log2(tab)\r\n+}\r\n+\r\n+### DESeq2 or Rlog ###\r\n+if (transformation_method %in% c("DESeq2","Rlog")) {\r\n+\tlibrary(DESeq2)\r\n+\tn <- ncol(tab)\r\n+\tdds <- DESeqDataSetFromMatrix(tab,\r\n+\t\t\t\t colData = data.frame(condition = c("a", rep("b", n - 1))),\r\n+\t\t\t\t design = formula(~ condition))\r\n+\tcolnames(dds) <- colnames(tab)\r\n+\tdds <- estimateSizeFactors(dds)\r\n+\ttab <- switch(transformation_method,\r\n+ DESeq2 = counts(dds, normalized = TRUE),\r\n+ Rlog = assay(rlogTransformation(dds))\r\n+\t)\r\n+}\r\n+\r\n+### Standard_score ###\r\n+if (transformation_method=="Standard_score") {\r\n+\ttab=scale(tab)\r\n+}\r\n+\r\n+### Pareto ###\r\n+if (transformation_method=="Pareto") {\r\n+\ttab.centered <- apply(tab, 2, function(x) x - mean(x,na.rm=TRUE))\r\n+\ttab.sc <- apply(tab.centered, 2, function(x) 
x/sqrt(sd(x,na.rm=TRUE)))\r\n+\ttab=tab.sc\r\n+}\r\n+\r\n+### TSS ###\r\n+if (transformation_method=="TSS") {\r\n+\ttab= t(apply(tab, 1, function(x) x/sum(x,na.rm=TRUE)))\r\n+}\r\n+\r\n+### TSS + CLR avec function de mixOmics ###\r\n+if (transformation_method=="TSS_CLR") {\r\n+\t#From http://stackoverflow.com/questions/2602583/geometric-mean-is-there-a-built-in\r\n+\tgeometric.mean = function(x, na.rm=TRUE){\r\n+\t\texp(sum(log(x[x > 0]), na.rm=na.rm) / length(x))\r\n+\t}\r\n+\ttab = t(apply(tab+1e-2,1,function(x) log(x/geometric.mean(x,na.rm=TRUE))))\r\n+}\r\n+\r\n+\r\n+#If there is a log transformation put 0 where there was NA\r\n+if (log.transformed) {\r\n+\tfor (idx_col in 1:ncol(tab)) {\r\n+\t\ttab[na.replaced[,idx_col],idx_col]=0\r\n+\t}\r\n+}\r\n+\r\n+#If there are missing values, replace it with NA_enconding\r\n+for (idx_col in 1:ncol(tab)) {\r\n+\tsel=is.na(tab[,idx_col])\r\n+\ttab[sel,idx_col]=na_encoding\r\n+}\r\n+\r\n+##########################################################\r\n+# Prepare and write output table\r\n+##########################################################\r\n+if (variable_in_line=="1") {\r\n+\t#Transpose matrix again\r\n+\ttab=t(tab)\r\n+}\r\n+\r\n+tab_out=cbind(rownames(tab),tab)\r\n+colnames(tab_out)[1]=colnames(tab_in)[1]\r\n+\r\n+write.table(file=output_file,tab_out,sep="\\t",row.names=F,quote=F)\r\n+\r\n+##########################################################\r\n+# Treatment successfull\r\n+##########################################################\r\n+cat("<HTML><HEAD><TITLE>Normalization report</TITLE></HEAD><BODY>\\n",file=log_file,append=F,sep="")\r\n+cat(paste("➔ You choose to apply the transformation method :",transformation_method,"<BR>"),file=log_file,append=F,sep="")\r\n+cat("✓ Your normalization process is successfull !<BR>",file=log_file,append=T,sep="")\r\n+cat("</BODY></HTML>\\n",file=log_file,append=T,sep="")\r\n+\r\n+q(save="no",status=0)\r\n+\r\n+} # end of 
function\r\n+\r\n+##########################################################\r\n+# Test\r\n+##########################################################\r\n+#Used for debug : LJO 6/3/2017\r\n+#normalization()\r\n+#setwd("H:/INRA/cati/groupe stats/Galaxy/normalisation")\r\n+#normalization(transformation_method="Standard_score",na_encoding="NA",input_file="datasets/valid - decathlon.txt",output_file="out/table_out.txt",log_file="log/normalization.html",variable_in_line="0")\r\n+#normalization(transformation_method="Pareto",na_encoding="NA",input_file="datasets/valid - decathlon.txt",output_file="out/table_out.txt",log_file="log/normalization.html",variable_in_line="1")\r\n+\r\n' |
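Note on the Pareto branch of normalization.R above: it centers each column on its mean, then divides by the square root of the column's standard deviation. The sketch below restates that in base R. The helper name `pareto_scale` and the toy matrix are assumptions made for illustration and are not part of the commit.

```r
# Hypothetical helper (name not in the commit): Pareto scaling of each column.
pareto_scale <- function(m) {
  # Center each column on its mean, ignoring missing values
  centered <- sweep(m, 2, colMeans(m, na.rm = TRUE), "-")
  # Divide each column by the square root of its standard deviation
  sweep(centered, 2, sqrt(apply(m, 2, sd, na.rm = TRUE)), "/")
}

# Toy data, invented for illustration
toy <- cbind(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))
scaled <- pareto_scale(toy)

# After Pareto scaling, column means are 0 and each column's standard
# deviation equals the square root of its original standard deviation.
col_means <- colMeans(scaled)
col_sds <- apply(scaled, 2, sd)
```

This matches the help text's description of Pareto scaling as leaving each variable with a variance equal to its original standard deviation, rather than unit variance as with the standard score.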
diff -r 000000000000 -r 79f00bc83ecc normalization.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/normalization.xml Thu Jan 18 06:20:30 2018 -0500 |
b'@@ -0,0 +1,229 @@\n+<!--# Copyright (C) 2017 INRA\n+# This program is free software: you can redistribute it and/or modify\n+# it under the terms of the GNU General Public License as published by\n+# the Free Software Foundation, either version 3 of the License, or\n+# (at your option) any later version.\n+#\n+# This program is distributed in the hope that it will be useful,\n+# but WITHOUT ANY WARRANTY; without even the implied warranty of\n+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n+# GNU General Public License for more details.\n+# \n+# You should have received a copy of the GNU General Public License\n+# along with this program. If not, see http://www.gnu.org/licenses/.\n+#-->\n+\n+<tool id="normalization" name="Normalization" version="1.0.0">\n+ <description>Normalize your data with some well known methods</description>\n+ <requirements>\n+ <requirement type="package">R</requirement>\n+ <requirement type="package">bioconductor-deseq2</requirement>\n+ <requirement type="package">r-batch</requirement>\n+ </requirements>\n+ <stdio>\n+ <!-- Anything other than zero is an error -->\n+ <exit_code range="1:" level="fatal"/>\n+ <exit_code range=":-1" level="fatal"/>\n+ </stdio>\n+ <command interpreter="Rscript"><![CDATA[\n+ normalization_galaxy.R\n+ input_file \'${input_file}\'\n+ transformation_method \'${transformation_method}\'\n+ na_encoding \'${na_encoding}\'\n+ output_file \'${output_file}\'\n+ log_file \'${log_file}\'\n+ variable_in_line \'${variable_in_line}\'\n+ ]]></command>\n+ <inputs> \n+ <param format="tabular,csv" name="input_file" type="data" label="Input file"/>\n+ <param name="transformation_method" type="select" label="Data transformation method" help="See the complete help below for more details"> \n+ <option value="log">Log (binary logarithm)</option>\n+ <option value="DESeq2">DESeq2 for NGS counts</option>\n+ <option value="Rlog">RLog (as implemented in DESeq2)</option>\n+ <option value="Standard_score">Standard score 
(mean=0;sd=1) </option>\n+ <option value="Pareto">Pareto (mean=0;sd moderate)</option>\n+ <option value="TSS">Total sum scaling (TSS)</option>\n+ <option value="TSS_CLR">Total sum scaling + log ratio (TSS+CLR)</option>\n+ <validator type="empty_field" message="Please choose, at least, one data transformation method." />\n+ </param> \n+ <param name="na_encoding" size="30" type="text" value="NA" label="Label used for Missing values"/> \n+ <param name="variable_in_line" type="select" multiple="false" display="radio" label="Variable in line or column?">\n+ <option value="1">Line</option>\n+ <option value="0">Column</option>\n+ </param>\n+ </inputs>\n+ <outputs>\n+ <data name="log_file" format="html" label="Normalization_log"/>\n+ <data name="output_file" format_source="input_file" label="Transfo-${transformation_method.value}_${input_file.name}"/>\n+ </outputs>\n+ <tests>\n+ <test>\n+ <param name="input_file" value="decathlon.tsv"/>\n+ <param name="transformation_method" value="log"/>\n+ <param name="na_encoding" value="NA"/>\n+ <param name="variable_in_line" value="0"/>\n+ <output name="log_file" file="log_file"/>\n+ <output name="output_file" file="output_file"/>\n+ </test>\n+ </tests>\n+ <help><![CDATA[\n+\n+=========\n+Normalize\n+=========\n+\n+-----------\n+Description\n+-----------\n+\n+ - This tool is part of a set of statistical tools made by members of the BIOS4BIOL group ("Normalization", "Summary statistics", "Hierarchical clustering" and "PCAFactoMineR").\n+ - Please use this Normalization module before using other modules of the suite.\n+\n+What it does: \n+ - It normalize your data with some well known methods\n+\n+------\n+\n+-----------\n+Input fil'..b"| Check if this nature of data is adapted to the type of analysis you want to do\n+\n+If your nature of data is not adapted to the analysis you plan to do, you should first transform your data in a scale of values which fits better requirement of your analysis.\n+This transformation process is named 
\xe2\x80\x9cnormalization\xe2\x80\x9d.\n+\n+\n+---------------------\n+Normalization Methods\n+---------------------\n+\n+In this Galaxy module, we propose several normalization methods, and we provide some guidelines to help user choose the accurate normalization method:\n+\n+Log normalization\n+ | -Objective: Binary logarithm provide homogeneity of variance even if the range of values is pretty large\n+ | -Accepted: values Any positive or null real numbers\n+ | (null values, will stay null after transformation)\n+ | -Range of values: Input: [0;100.000] / Output: [0;17]\n+ | -Adapted for: PCA, HC, SS*\n+ | \n+\n+DESeq2 normalization\n+ | -Objective: Obtain comparable counts between samples, whatever the difference of their libraries sequencing depth\n+ | -Accepted values: NGS counts (positive integers ; no missing values)\n+ | (null values, will stay null after transformation)\n+ | -Range of values: Input: [0;100.000] / Output: [0; 100.000]\n+ | -Adapted for: Differential analysis\n+ | \n+\n+RLog normalization\n+ | -Objective: Similar to a combination of {DESeq2 + Log} transformation\n+ | -Accepted values: NGS counts (positive integers ; no missing values)\n+ | -Range of values: Input: [0;100.000] / Output: [0; 20]\n+ | -Adapted for: PCA, HC, SS\n+ | \n+\n+Standard score normalization\n+ | -Objective: Transform values such as {mean=0 and standard deviation=1} for all variables.\n+ | -Accepted values: No specific constraint\n+ | -Range of values: No specific constraint\n+ | -Adapted for: PCA, HC, SS\n+ | \n+\n+Pareto normalization\n+ | -Objective: Transform values such as\n+ | {mean=0 and variance equal to its standard deviation instead of unit variance} for all variables.\n+ | -Accepted values: No specific constraint\n+ | -Range of values: No specific constraint\n+ | -Adapted for: metabolite intensity values before PCA, HC, SS\n+ | \n+\n+Total sum scaling normalization (TSS)\n+ | -Objective: Normalizes count data by dividing variable read count by the total number 
of read counts in each individual sample\n+ | -Accepted values: 16S rRNA amplicon sequencing\n+ | -Range of values: Input: no specific constraint / Output: [0;1[\n+ | -Adapted for: PCA, HC, SS\n+ | \n+\n+Total sum scaling+Log ratio normalization (TSS+CLR)\n+ | -Objective: Transform values such as {mean=0 and standard deviation=1} for all variables.\n+ | -Accepted values: 16S rRNA amplicon sequencing\n+ | -Range of values: Input: no specific constraint / Output: [0;1[\n+ | -Adapted for: PCA, HC, SS\n+\n+(*)PCA: Principal Component Analysis / HC: Hierarchical Clustering / SS: Summary Statistics\n+\n+------\n+\n+**Authors**: Luc Jouneau (luc.jouneau@inra.fr), Sarah Maman (sarah.maman@inra.fr) and Valentin Marcon (valentin.marcon@inra.fr) \n+\n+Contact : support.sigenae@inra.fr\n+\n+E-learning available : Not yet.\n+\n+.. class:: infomark\n+\n+-------------\n+Please cite :\n+-------------\n+\n+- (Depending on the help provided you can cite us in acknowledgements, references or both.)\n+ \n+Acknowledgements\n+ | We wish to thank SIGENAE group and the statistical CATI BIOS4Biol group : Luc Jouneau, Sarah Maman\n+ | Re-packaging was provided by Valentin Marcon (INRA, Migale platform http://migale.jouy.inra.fr), as part of the IFB project 'Galaxy For Life Science' (http://www.france-bioinformatique.fr/fr)\n+ | \n+ \n+References\n+ | SIGENAE [http://www.sigenae.org/]\n+ |\n+\n+ ]]></help>\n+</tool>\n" |
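Note on the TSS and TSS+CLR entries of the help text above: total sum scaling divides each sample's values by that sample's total, and the centered log ratio takes the log of each value over the geometric mean of its sample. The sketch below is a minimal base R illustration under the assumption that rows are samples and all values are strictly positive. The helper names `tss` and `clr` and the toy counts are hypothetical, and the sketch leaves out the pseudo-count that normalization.R adds.

```r
# Hypothetical helpers (names not in the commit); rows are samples.
tss <- function(m) m / rowSums(m, na.rm = TRUE)

clr <- function(m) {
  log_m <- log(m)
  # Subtracting the row mean of the logs is the same as dividing
  # by the row's geometric mean before taking the log.
  log_m - rowMeans(log_m)
}

# Toy strictly positive counts, invented for illustration
counts <- rbind(s1 = c(10, 20, 70), s2 = c(5, 5, 90))

props <- tss(counts)        # each row now sums to 1
clr_props <- clr(props)     # each row now sums to 0
clr_counts <- clr(counts)   # same result: CLR ignores per-row scaling
```

CLR values are log ratios that sum to zero within each sample, so they can be negative and are not confined to [0;1[.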
diff -r 000000000000 -r 79f00bc83ecc normalization_galaxy.R --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/normalization_galaxy.R Thu Jan 18 06:20:30 2018 -0500 |
@@ -0,0 +1,45 @@
+#!/usr/local/bioinfo/bin/Rscript --vanilla --slave --no-site-file
+
+# R Script making the bridge between Galaxy and the call of the normalization method
+#-----------------------------------------------------------------
+# Authors : luc.jouneau(at)inra.fr
+# valentin.marcon(at)inra.fr
+# Version : 0.9
+# Date : 30/08/2017
+#---------------------------------------------------------------
+
+##------------------------------
+## Options
+##------------------------------
+strAsFacL <- options()$stringsAsFactors
+options(stringsAsFactors = FALSE)
+
+##------------------------------
+## Libraries laoding
+##------------------------------
+# For parseCommandArgs function
+library(batch)
+
+# R script call
+source_local <- function(fname)
+{
+    argv <- commandArgs(trailingOnly = FALSE)
+    base_dir <- dirname(substring(argv[grep("--file=", argv)], 8))
+    source(paste(base_dir, fname, sep="/"))
+}
+
+#Import the different functions used for Normalization
+source_local("normalization.R")
+
+##------------------------------
+## Lecture parametres
+##------------------------------
+argLs <- parseCommandArgs(evaluate=FALSE)
+
+normalization(input_file=argLs[["input_file"]],
+    transformation_method=argLs[["transformation_method"]],
+    na_encoding=argLs[["na_encoding"]],
+    output_file=argLs[["output_file"]],
+    log_file=argLs[["log_file"]],
+    variable_in_line=argLs[["variable_in_line"]])
+ |
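Note on the wrapper above: normalization.xml passes its parameters as alternating name/value tokens (`input_file '...' transformation_method '...'`), and the wrapper looks them up by name in `argLs`. The sketch below shows that pairing in base R. It is an assumption about how such pairs can be turned into a named list, not the implementation of `parseCommandArgs` from the batch package, and the helper name `parse_pairs` and the file names are hypothetical.

```r
# Hypothetical stand-in (not the batch package's code): turn
# c("key1", "value1", "key2", "value2") into a named list.
parse_pairs <- function(args) {
  if (length(args) == 0) {
    return(list())
  }
  # Every name needs a value, so the token count must be even
  stopifnot(length(args) %% 2 == 0)
  keys <- args[seq(1, length(args), by = 2)]
  values <- args[seq(2, length(args), by = 2)]
  setNames(as.list(values), keys)
}

# Example tokens shaped like the Galaxy command line (file names invented)
example_args <- c("input_file", "in.tsv",
                  "transformation_method", "log",
                  "variable_in_line", "0")
arg_list <- parse_pairs(example_args)
```

Values stay character strings here, which is consistent with normalization.R comparing `variable_in_line` against the string "1" rather than a number.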
diff -r 000000000000 -r 79f00bc83ecc test-data/decathlon.tsv --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/decathlon.tsv Thu Jan 18 06:20:30 2018 -0500 |
@@ -0,0 +1,42 @@
+"name" "100m" "Long.jump" "Shot.put" "High.jump" "400m" "110m.hurdle" "Discus" "Pole.vault" "Javeline" "1500m"
+"SEBRLE" 11.04 7.58 14.83 2.07 49.81 14.69 43.75 5.02 63.19 291.7
+"CLAY" 10.76 7.4 14.26 1.86 49.37 14.05 50.72 4.92 60.15 301.5
+"KARPOV" 11.02 7.3 14.77 2.04 48.37 14.09 48.95 4.92 50.31 300.2
+"BERNARD" 11.02 7.23 14.25 1.92 48.93 14.99 40.87 5.32 62.77 280.1
+"YURKOV" 11.34 7.09 15.19 2.1 50.42 15.31 46.26 4.72 63.44 276.4
+"WARNERS" 11.11 7.6 14.31 1.98 48.68 14.23 41.1 4.92 51.77 278.1
+"ZSIVOCZKY" 11.13 7.3 13.48 2.01 48.62 14.17 45.67 4.42 55.37 268
+"McMULLEN" 10.83 7.31 13.76 2.13 49.91 14.38 44.41 4.42 56.37 285.1
+"MARTINEAU" 11.64 6.81 14.57 1.95 50.14 14.93 47.6 4.92 52.33 262.1
+"HERNU" 11.37 7.56 14.41 1.86 51.1 15.06 44.99 4.82 57.19 285.1
+"BARRAS" 11.33 6.97 14.09 1.95 49.48 14.48 42.1 4.72 55.4 282
+"NOOL" 11.33 7.27 12.68 1.98 49.2 15.29 37.92 4.62 57.44 266.6
+"BOURGUIGNON" 11.36 6.8 13.46 1.86 51.16 15.67 40.49 5.02 54.68 291.7
+"Sebrle" 10.85 7.84 16.36 2.12 48.36 14.05 48.72 5 70.52 280.01
+"Clay" 10.44 7.96 15.23 2.06 49.19 14.13 50.11 4.9 69.71 282
+"Karpov" 10.5 7.81 15.93 2.09 46.81 13.97 51.65 4.6 55.54 278.11
+"Macey" 10.89 7.47 15.73 2.15 48.97 14.56 48.34 4.4 58.46 265.42
+"Warners" 10.62 7.74 14.48 1.97 47.97 14.01 43.73 4.9 55.39 278.05
+"Zsivoczky" 10.91 7.14 15.31 2.12 49.4 14.95 45.62 4.7 63.45 269.54
+"Hernu" 10.97 7.19 14.65 2.03 48.73 14.25 44.72 4.8 57.76 264.35
+"Nool" 10.8 7.53 14.26 1.88 48.81 14.8 42.05 5.4 61.33 276.33
+"Bernard" 10.69 7.48 14.8 2.12 49.13 14.17 44.75 4.4 55.27 276.31
+"Schwarzl" 10.98 7.49 14.01 1.94 49.76 14.25 42.43 5.1 56.32 273.56
+"Pogorelov" 10.95 7.31 15.1 2.06 50.79 14.21 44.6 5 53.45 287.63
+"Schoenbeck" 10.9 7.3 14.77 1.88 50.3 14.34 44.41 5 60.89 278.82
+"Barras" 11.14 6.99 14.91 1.94 49.41 14.37 44.83 4.6 64.55 267.09
+"Smith" 10.85 6.81 15.24 1.91 49.27 14.01 49.02 4.2 61.52 272.74
+"Averyanov" 10.55 7.34 14.44 1.94 49.72 14.39 39.88 4.8 54.51 271.02
+"Ojaniemi" 10.68 7.5 14.97 1.94 49.12 15.01 40.35 4.6 59.26 275.71
+"Smirnov" 10.89 7.07 13.88 1.94 49.11 14.77 42.47 4.7 60.88 263.31
+"Qi" 11.06 7.34 13.55 1.97 49.65 14.78 45.13 4.5 60.79 272.63
+"Drews" 10.87 7.38 13.07 1.88 48.51 14.01 40.11 5 51.53 274.21
+"Parkhomenko" 11.14 6.61 15.69 2.03 51.04 14.88 41.9 4.8 65.82 277.94
+"Terek" 10.92 6.94 15.15 1.94 49.56 15.12 45.62 5.3 50.62 290.36
+"Gomez" 11.08 7.26 14.57 1.85 48.61 14.41 40.95 4.4 60.71 269.7
+"Turi" 11.08 6.91 13.62 2.03 51.67 14.26 39.83 4.8 59.34 290.01
+"Lorenzo" 11.1 7.03 13.22 1.85 49.34 15.38 40.22 4.5 58.36 263.08
+"Karlivans" 11.33 7.26 13.3 1.97 50.54 14.98 43.34 4.5 52.92 278.67
+"Korkizoglou" 10.86 7.07 14.81 1.94 51.16 14.96 46.07 4.7 53.05 317
+"Uldal" 11.23 6.99 13.53 1.85 50.95 15.09 43.01 4.5 60 281.7
+"Casarsa" 11.36 6.68 14.92 1.94 53.2 15.39 48.66 4.4 58.62 296.12 |
diff -r 000000000000 -r 79f00bc83ecc test-data/log_file --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/log_file Thu Jan 18 06:20:30 2018 -0500 |
@@ -0,0 +1,1 @@ +➔ You choose to apply the transformation method : log <BR>✓ Your normalization process is successfull !<BR></BODY></HTML> |
diff -r 000000000000 -r 79f00bc83ecc test-data/output_file --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/output_file Thu Jan 18 06:20:30 2018 -0500 |
@@ -0,0 +1,42 @@
+name 100m Long.jump Shot.put High.jump 400m 110m.hurdle Discus Pole.vault Javeline 1500m
+SEBRLE 3.46466826700344 2.92219784839637 3.89044669267991 1.0496307677246 5.63836350589786 3.87676249078156 5.45121111183233 2.32768736417605 5.98162436069636 8.18834157601082
+CLAY 3.4276061727819 2.88752527074159 3.83390207666916 0.895302621333307 5.62556273996637 3.81249822533356 5.66448284036468 2.29865831556452 5.91049283228871 8.23601419190008
+KARPOV 3.46205231879643 2.86789646399265 3.88459792099006 1.02856915219677 5.59604063267114 3.81659970653491 5.61323695471633 2.29865831556452 5.65277328451078 8.22978016673333
+BERNARD 3.46205231879643 2.85399564717639 3.83289001416474 0.941106310946431 5.61264737765832 3.90592847817313 5.35297033824593 2.41142624572647 5.97200330380836 8.12979817318714
+YURKOV 3.5033487351675 2.82578562746479 3.92504996472736 1.0703893278914 5.65592421308382 3.93640237772506 5.53169336086147 2.23878685958712 5.98732086592925 8.1106138055009
+WARNERS 3.47378691161437 2.92599941855622 3.83895176695194 0.985500430304885 5.605257262939 3.83086375675176 5.36106648879432 2.29865831556452 5.6940444130862 8.11945993445933
+ZSIVOCZKY 3.47638168756724 2.86789646399265 3.75274859140713 1.0071955014042 5.603477988254 3.82476785314329 5.51317488460363 2.14404636961671 5.79103261675567 8.06608919045777
+McMULLEN 3.4369613378336 2.86987140617771 3.78240856492737 1.09085343045111 5.64125699872677 3.84599177066457 5.47281266619239 2.14404636961671 5.81685566236649 8.15532422905059
+MARTINEAU 3.54101915313356 2.76765479823735 3.86492897228979 0.963474123974886 5.64789009105921 3.90014226035982 5.57288966842058 2.29865831556452 5.70956635383999 8.03397354344106
+HERNU 3.50716034911752 2.91838623444635 3.84899843040002 0.895302621333307 5.67525138605026 3.9126498648972 5.49153246180432 2.26903314645524 5.83769100042856 8.15532422905059
+BARRAS 3.50207595604579 2.8011586560937 3.81659970653491 0.963474123974886 5.62877359520165 3.85598969730848 5.39574832817903 2.23878685958712 5.79181407116183 8.13955135239879
+NOOL 3.50207595604579 2.86195536414487 3.66448284036468 0.985500430304885 5.62058641045188 3.93451650158608 5.24488705912353 2.20789285164133 5.84398384404833 8.05853297020161
+BOURGUIGNON 3.50589092972996 2.76553474636298 3.75060650483559 0.895302621333307 5.67694435910691 3.96993327469786 5.33949373790173 2.32768736417605 5.77294133783134 8.18834157601082
+Sebrle 3.43962313755712 2.97085365434048 4.03210084316702 1.08406426478847 5.5957423394744 3.81249822533356 5.60644222813161 2.32192809488736 6.13996056954546 8.12933454084779
+Clay 3.38404980679516 2.99276843076892 3.92884403671257 1.04264433740849 5.6202931499486 3.82068956055921 5.64702663265485 2.29278174922785 6.12329372251013 8.13955135239879
+Karpov 3.39231742277876 2.96532254836725 3.99367436175058 1.06350294230616 5.54874485993723 3.80426011563474 5.6906964439777 2.20163386116965 5.79545527204412 8.11951181037098
+Macey 3.44493204894218 2.90110824301451 3.97544676564096 1.10433665981474 5.61382629093404 3.86393845042397 5.59514556799086 2.13750352374994 5.86937792403133 8.05213327492777
+Warners 3.40871186102943 2.95233356636969 3.85598969730848 0.978195629681652 5.58406053442676 3.80838505065609 5.45055144330648 2.29278174922785 5.7915536333886 8.11920052691734
+Zsivoczky 3.44757919654888 2.83592407425437 3.93640237772506 1.08406426478847 5.62643913669732 3.90207357931074 5.51159454153741 2.23266075679027 5.98754825895374 8.07435557599732
+Hernu 3.45549162062847 2.84599177066457 3.87282875953489 1.02147972741045 5.60673831741759 3.83289001416474 5.48284828306847 2.26303440583379 5.85199883711245 8.04630551649358
+Nool 3.43295940727611 2.9126498648972 3.83390207666916 0.910732661902913 5.60910484661896 3.88752527074159 5.39403389536778 2.43295940727611 5.93852104586597 8.1102483878344
+Bernard 3.41818994794577 2.90303827011291 3.88752527074159 1.08406426478847 5.61853233397629 3.82476785314329 5.48381577726426 2.13750352374994 5.78842470743944 8.11014396578449
+Schwarzl 3.45680614923047 2.90496571868403 3.80838505065609 0.956056652412403 5.63691458035588 3.83289001416474 5.40701277351601 2.35049724708413 5.81557542886257 8.09571348425126
+Pogorelov 3.45285896471381 2.86987140617771 3.91647664443772 1.04264433740849 5.66647256884207 3.82883464946806 5.47897180503294 2.32192809488736 5.74011804283313 8.16807034745054
+Schoenbeck 3.44625622988956 2.86789646399265 3.88459792099006 0.910732661902913 5.65248649491816 3.84197311892718 5.47281266619239 2.32192809488736 5.92813340782985 8.12319024045716
+Barras 3.47767732756531 2.80529245560071 3.89820835250872 0.956056652412403 5.62673115067279 3.84498815668261 5.48639259427596 2.20163386116965 6.01234519041983 8.06118215144418
+Smith 3.43962313755712 2.76765479823735 3.9297909977186 0.933572638261024 5.62263756653746 3.80838505065609 5.61529857909211 2.0703893278914 5.9429835981871 8.09138249094364
+Averyanov 3.39917109381982 2.87578006306849 3.85199883711245 0.956056652412403 5.63575439127776 3.84699468696557 5.31759350462347 2.26303440583379 5.76844901518055 8.08225550938395
+Ojaniemi 3.41683974191283 2.90689059560852 3.90400231628369 0.956056652412403 5.61823865559545 3.90785207184596 5.33449676839042 2.20163386116965 5.88898672118656 8.10700778461203
+Smirnov 3.44493204894218 2.82171021503467 3.79493566280354 0.956056652412403 5.61794491742059 3.88459792099006 5.40837220357268 2.23266075679027 5.92789645372882 8.04061850294165
+Qi 3.46727948045998 2.87578006306849 3.76022094646651 0.978195629681652 5.63372181264101 3.88557436437143 5.49601487343776 2.16992500144231 5.92576211367192 8.09080051372945
+Drews 3.44228003525258 2.88362081628567 3.70818723602071 0.910732661902913 5.60021027442009 3.80838505065609 5.32589006103978 2.32192809488736 5.6873406873783 8.09913737463126
+Parkhomenko 3.47767732756531 2.72465027173297 3.97177344719337 1.02147972741045 5.67355642399014 3.89530262133331 5.38887833881199 2.26303440583379 6.04045412134962 8.11862966608686
+Terek 3.44890095114513 2.79493566280354 3.92124588858559 0.956056652412403 5.63110428236588 3.91838623444635 5.51159454153741 2.40599235967584 5.66163560233596 8.18169891109611
+Gomez 3.46988597627446 2.85996954822103 3.86492897228979 0.887525270741587 5.60318122901386 3.84899843040002 5.35579154675365 2.13750352374994 5.92386226807603 8.07521171134824
+Turi 3.46988597627446 2.78868571061353 3.76765479823735 1.02147972741045 5.69125497865014 3.83390207666916 5.31578357458946 2.26303440583379 5.89093302170554 8.17995883726209
+Lorenzo 3.47248777146274 2.81352468929781 3.72465027173297 0.887525270741587 5.6246858105254 3.9429835981871 5.32984117653063 2.16992500144231 5.86690797814317 8.0393577651597
+Karlivans 3.50207595604579 2.85996954822103 3.73335434061383 0.978195629681652 5.65935375917005 3.90496571868403 5.43762724831895 2.16992500144231 5.72574115650395 8.12241388837423
+Korkizoglou 3.44095219802964 2.82171021503467 3.88849973551412 0.956056652412403 5.67694435910691 3.90303827011291 5.52575569282949 2.23266075679027 5.7292808460274 8.30833903013941
+Uldal 3.48928602262588 2.80529245560071 3.75808993420181 0.887525270741587 5.67101024127845 3.91552090075196 5.42660022616993 2.16992500144231 5.90689059560852 8.13801575348757
+Casarsa 3.50589092972996 2.73984810269933 3.89917563048051 0.956056652412403 5.73335434061383 3.94392132655349 5.6046644151637 2.13750352374994 5.87332106291482 8.21003812347289 |