# HG changeset patch
# User artbio
# Date 1731016921 0
# Node ID cc768b0f41cffed9787fbdc8793beeccb3dc0798
# Parent 6864acb21714adeeb01210c30462e5853bfe813a
planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/main/tools/gsc_scran_normalize commit 9ab82433f375b37be5c9acb22e5deb798081dc3b

diff -r 6864acb21714 -r cc768b0f41cf scran-normalize.R
--- a/scran-normalize.R Sun Dec 10 00:27:45 2023 +0000
+++ b/scran-normalize.R Thu Nov 07 22:02:01 2024 +0000
@@ -1,8 +1,9 @@
-options(show.error.messages = FALSE,
-  error = function() {
-    cat(geterrmessage(), file = stderr())
-    q("no", 1, FALSE)
-  }
+options(
+    show.error.messages = FALSE,
+    error = function() {
+        cat(geterrmessage(), file = stderr())
+        q("no", 1, FALSE)
+    }
 )
 loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")
 warnings()
@@ -13,63 +14,63 @@

 # Arguments
 option_list <- list(
-  make_option(
-    c("-d", "--data"),
-    default = NA,
-    type = "character",
-    help = "Input file that contains count values to transform"
-  ),
-  make_option(
-    "--cluster",
-    default = FALSE,
-    action = "store_true",
-    type = "logical",
-    help = "Whether to calculate the size factor per cluster or on all cell"
-  ),
-  make_option(
-    c("-m", "--method"),
-    default = "hclust",
-    type = "character",
-    help = "The clustering method to use for grouping cells into cluster : hclust or igraph [default : '%default' ]"
-  ),
-  make_option(
-    "--size",
-    default = 100,
-    type = "integer",
-    help = "Minimal number of cells in each cluster : hclust or igraph [default : '%default' ]"
-  ),
-  make_option(
-    c("-o", "--out"),
-    default = "res.tab",
-    type = "character",
-    help = "Output name [default : '%default' ]"
-  )
+    make_option(
+        c("-d", "--data"),
+        default = NA,
+        type = "character",
+        help = "Input file that contains count values to transform"
+    ),
+    make_option(
+        "--cluster",
+        default = FALSE,
+        action = "store_true",
+        type = "logical",
+        help = "Whether to calculate the size factor per cluster or on all cell"
+    ),
+    make_option(
+        c("-m", "--method"),
+        default = "hclust",
+        type = "character",
+        help = "The clustering method to use for grouping cells into cluster : hclust or igraph [default : '%default' ]"
+    ),
+    make_option(
+        "--size",
+        default = 100,
+        type = "integer",
+        help = "Minimal number of cells in each cluster : hclust or igraph [default : '%default' ]"
+    ),
+    make_option(
+        c("-o", "--out"),
+        default = "res.tab",
+        type = "character",
+        help = "Output name [default : '%default' ]"
+    )
 )

 opt <- parse_args(OptionParser(option_list = option_list),
-                  args = commandArgs(trailingOnly = TRUE))
+    args = commandArgs(trailingOnly = TRUE)
+)

 data <- read.table(
-  opt$data,
-  check.names = FALSE,
-  header = TRUE,
-  row.names = 1,
-  sep = "\t"
+    opt$data,
+    check.names = FALSE,
+    header = TRUE,
+    row.names = 1,
+    sep = "\t"
 )

 ## Import data as a SingleCellExperiment object
 sce <- SingleCellExperiment(list(counts = as.matrix(data)))

 if (opt$cluster) {
-  clusters <- quickCluster(sce, min.size = opt$size, method = opt$method)
+    clusters <- quickCluster(sce, min.size = opt$size, method = opt$method)

-  ## Compute sum factors
-  sce <- computeSumFactors(sce, cluster = clusters)
+    ## Compute sum factors
+    sce <- computeSumFactors(sce, cluster = clusters)
 } else {
-
-  ## Compute sum factors
-  sce <- computeSumFactors(sce)
+    ## Compute sum factors
+    sce <- computeSumFactors(sce)
 }

 sce <- logNormCounts(sce)
@@ -78,10 +79,10 @@


 write.table(
-  logcounts,
-  opt$out,
-  col.names = TRUE,
-  row.names = FALSE,
-  quote = FALSE,
-  sep = "\t"
+    logcounts,
+    opt$out,
+    col.names = TRUE,
+    row.names = FALSE,
+    quote = FALSE,
+    sep = "\t"
 )
diff -r 6864acb21714 -r cc768b0f41cf scran_normalize.xml
--- a/scran_normalize.xml Sun Dec 10 00:27:45 2023 +0000
+++ b/scran_normalize.xml Thu Nov 07 22:02:01 2024 +0000
@@ -1,5 +1,8 @@
-
+
 Normalize raw counts expression values using deconvolution size factors
+
+ galaxy_single_cell_suite
+

 bioconductor-scran
 r-dynamictreecut
@@ -71,8 +74,10 @@
 Cell-specific biases are normalized using the computeSumFactors method, which implements the deconvolution strategy for scaling normalization (A. T. Lun, Bach, and Marioni 2016).

 It creates a reference :
-    - if no clustering step : the average count of all transcriptomes
-    - if you choose to cluster your cells : the average count of each cluster.
+
+- if no clustering step : the average count of all transcriptomes
+- if you choose to cluster your cells : the average count of each cluster.
+
 Then it pools cells and then sum their expression profiles. The size factor is described as the median ration between the count sums and the average across all genes.
 Finally it constructs a linear distribution (deconvolution method) of size factors by taking multiple pools of cells.

@@ -80,9 +85,8 @@

 You can apply this method on cell cluster instead of your all set of cells by using quickCluster.
 It defines cluster using distances based on Spearman correlation on counts between cells, there is two available methods :
-    - *hclust* : hierarchical clustering on the distance matrix and dynamic tree cut.
-    - *igraph* : constructs a Shared Nearest Neighbor graph (SNN) on the distance matrix and identifies highly connected communities.
-
+- *hclust* : hierarchical clustering on the distance matrix and dynamic tree cut.
+- *igraph* : constructs a Shared Nearest Neighbor graph (SNN) on the distance matrix and identifies highly connected communities.

 Note: First header row must NOT start with a '#' comment character