annotate readme.rst @ 1:cb8bac9d0d37 draft default tip

planemo upload commit 4ec21642211c1fb7427e4c98fdf0f4b9a3f8a185-dirty
author cristian
date Thu, 07 Sep 2017 10:21:45 -0400
parents 1535ffddeff4
children
1535ffddeff4 planemo upload commit a7ac27de550a07fd6a3e3ea3fb0de65f3a10a0e6-dirty
cristian

Notos
=====

Notos is a suite that calculates CpN o/e ratios (e.g., the commonly used CpG o/e ratios) for a set of nucleotide sequences and uses Kernel Density Estimation (KDE) to model the obtained distribution.

It consists of two programs: CpGoe.pl calculates the CpN o/e ratios, and KDEanalysis.r estimates the model.
Both programs are described briefly below.

CpGoe.pl
--------

This program calculates CpN o/e ratios for nucleotide multi-FASTA files.
For each sequence found in the file, it writes the sequence name followed by the CpN o/e ratio, where N can be any of the nucleotides A, C, G or T, to a TAB-separated file.

An example call would be::

    perl CpGoe.pl -f input_species.fasta -a 1 -c CpG -o input_species_cpgoe.csv -m 200

The available contexts (-c) are CpG, CpA, CpC and CpT. The default is CpG.

The available algorithms (-a) for calculating the CpN o/e ratio are the following (shown here for CpG o/e)::

    1 => (CpG / (C * G)) * (L^2 / (L - 1))
    2 => (CpG / (C * G)) * L
    3 => (CpG / L) / ((C + G) / L)^2
    4 => (CpG / (C + G)/2)^2

Here L denotes the length of the sequence, CpG is the count of CG dinucleotides, and C and G are the counts of the respective bases.

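As a rough illustration, the first three formulas can be sketched in Python. This is not the CpGoe.pl implementation: the function name ``cpn_oe``, the upper-casing of the input, the overlapping dinucleotide count, and the substitution of N for G in the other contexts are assumptions made for this example. Formula 4 is left out because its grouping is ambiguous as printed.

```python
def cpn_oe(seq, context="CpG", algorithm=1):
    """Return the CpN o/e ratio of one sequence, or None if it is undefined.

    Hypothetical helper for illustration only; CpGoe.pl may treat case,
    ambiguous bases and edge cases differently.
    """
    seq = seq.upper()
    second = context[-1]            # second base of the dinucleotide: A, C, G or T
    length = len(seq)               # L in the formulas
    c = seq.count("C")              # count of C
    g = seq.count(second)           # count of the second base (G for CpG)
    # Count overlapping occurrences of the dinucleotide (matters for CpC).
    cpn = sum(1 for i in range(length - 1) if seq[i:i + 2] == "C" + second)
    if length < 2 or c == 0 or g == 0:
        return None                 # avoid division by zero
    if algorithm == 1:
        return (cpn / (c * g)) * (length ** 2 / (length - 1))
    if algorithm == 2:
        return (cpn / (c * g)) * length
    if algorithm == 3:
        return (cpn / length) / (((c + g) / length) ** 2)
    raise ValueError("only algorithms 1 to 3 are sketched here")
```

For the sequence ACGTCGCG (L = 8, C = 3, G = 3, CpG = 3), algorithm 1 gives 64/21, roughly 3.05.
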
KDEanalysis.r
-------------

This program carries out two steps.
First, a data preparation step, mainly to remove data artifacts.
Second, a mode detection step, which is based on a KDE modelling approach.

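The idea behind KDE-based mode detection can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure implemented in KDEanalysis.r: it assumes a Gaussian kernel, a rule-of-thumb bandwidth of the form ``band_width * sd * n^(-1/5)`` (with 1.06 as the constant), and it reports every local maximum on a regular grid as a mode. The function name ``kde_modes`` and its parameters are hypothetical.

```python
import math
import statistics


def kde_modes(values, band_width=1.06, grid_size=512):
    """Locate local maxima of a Gaussian KDE evaluated on a regular grid."""
    n = len(values)
    # Rule-of-thumb bandwidth (assumption): constant * sample sd * n^(-1/5).
    h = band_width * statistics.stdev(values) * n ** (-1 / 5)
    lo, hi = min(values) - 3 * h, max(values) + 3 * h
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    norm = n * h * math.sqrt(2 * math.pi)
    # Density estimate: average of Gaussian kernels centred on the observations.
    dens = [sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values) / norm
            for x in grid]
    # A grid point is a mode if it is strictly higher than both neighbours.
    return [grid[i] for i in range(1, grid_size - 1)
            if dens[i - 1] < dens[i] > dens[i + 1]]
```

With two well-separated clusters of ratios, for example around 0.4 and 1.0, this sketch returns two mode positions, one on each side of the midpoint.
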
Example of basic usage on the command line::

    Rscript ~/src/github/notos/KDEanalysis.r "Input species" input_species_cpgoe.csv

In the above case, "Input species" will be used to name the generated graphs and as an identifier for each sample.
It has to be surrounded by double quotes (") if the name of the species contains spaces.
The input of KDEanalysis.r has the same format as the output of CpGoe.pl.

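To inspect such a file outside of R, a small reader along the following lines may help. It is a sketch that assumes two TAB-separated columns (sequence name, ratio) and silently skips rows that do not fit that shape, such as a possible header line; the function name ``read_cpgoe_table`` is hypothetical.

```python
import csv
import io


def read_cpgoe_table(handle):
    """Parse name<TAB>ratio rows into (name, ratio) tuples."""
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if len(record) < 2:
            continue                      # blank or malformed line
        try:
            rows.append((record[0], float(record[1])))
        except ValueError:
            continue                      # e.g. a header line or a missing value
    return rows


# Usage with an in-memory example instead of a real file.
example = io.StringIO("seq1\t0.85\nseq2\t1.02\n")
table = read_cpgoe_table(example)
```
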
Any of the following parameters can be used:

+--------+---------------------+-----------------------------------------------------------------------------------------+
| Option | Long option         | Description                                                                             |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -o     | --frac-outl         | maximum fraction of CpG o/e ratios excluded as outliers [default 0.01]                  |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -d     | --min-dist          | minimum distance between modes; modes that are closer are joined [default 0.2]          |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -c     | --conf-level        | level of the confidence intervals of the mode positions [default 0.95]                  |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -m     | --mode-mass         | minimum probability mass of a mode [default 0.05]                                       |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -b     | --band-width        | bandwidth constant for kernels [default 1.06]                                           |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -B     | --bootstrap         | calculate confidence intervals of mode positions using bootstrap                        |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -r     | --bootstrap-reps    | number of bootstrap repetitions [default 1500]                                          |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -p     | --peak-file         | name of the output file describing the peaks of the KDE [default modes_basic_stats.csv] |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -s     | --bootstrap-file    | name of the output file with bootstrap values [default modes_bootstrap.csv]             |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -H     | --outlier-hist-file | outliers histogram file [default outliers_hist.pdf]                                     |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -C     | --cutoff-file       | outliers cutoff file [default outliers_cutoff.csv]                                      |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -k     | --kde-file          | kernel density estimation graph [default KDE.pdf]                                       |
+--------+---------------------+-----------------------------------------------------------------------------------------+

Of special interest is the -B parameter, which triggers the bootstrap calculations.
The default settings have been calibrated through extensive testing, so we advise modifying them only if you know what you are doing.

Output: Both the data preparation and the mode detection step return results to the user in the form of CSV files and figures.
The two figures illustrate the results of the data cleaning and mode detection steps, respectively.
The contents of the CSV files are described in the following.

1. outliers_cutoff.csv. The columns of this file contain:

+---------------+----------------------------------------------------------------------------------------------------------+
| Column        | Description                                                                                              |
+---------------+----------------------------------------------------------------------------------------------------------+
| Name          | name of the file analyzed                                                                                |
+---------------+----------------------------------------------------------------------------------------------------------+
| prop.zero     | proportion of observations equal to zero excluded (relative to original sample)                          |
+---------------+----------------------------------------------------------------------------------------------------------+
| prop.out.2iqr | proportion of values excluded if 2 * IQR was used, relative to sample after exclusion of zeros (0 - 100) |
+---------------+----------------------------------------------------------------------------------------------------------+
| prop.out.3iqr | proportion of values excluded if 3 * IQR was used, relative to sample after exclusion of zeros (0 - 100) |
+---------------+----------------------------------------------------------------------------------------------------------+
| prop.out.4iqr | proportion of values excluded if 4 * IQR was used, relative to sample after exclusion of zeros (0 - 100) |
+---------------+----------------------------------------------------------------------------------------------------------+
| prop.out.5iqr | proportion of values excluded if 5 * IQR was used, relative to sample after exclusion of zeros (0 - 100) |
+---------------+----------------------------------------------------------------------------------------------------------+
| used          | IQR used for exclusion of outliers / extreme values                                                      |
+---------------+----------------------------------------------------------------------------------------------------------+
| no.obs.raw    | number of observations in the original sample                                                            |
+---------------+----------------------------------------------------------------------------------------------------------+
| no.obs.nozero | number of observations in sample after excluding values equal to zero                                    |
+---------------+----------------------------------------------------------------------------------------------------------+
| no.obs.clean  | number of observations in sample after excluding outliers / extreme values                               |
+---------------+----------------------------------------------------------------------------------------------------------+

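How such columns could be derived is sketched below. The exact rule used by KDEanalysis.r is not spelled out here, so several points are assumptions: outliers are taken to be values outside ``[Q1 - k * IQR, Q3 + k * IQR]``, the smallest k in 2..5 whose excluded fraction does not exceed ``frac_outl`` is reported as ``used``, and all proportions are expressed on a 0 - 100 scale. The function name ``outlier_summary`` is hypothetical.

```python
import statistics


def outlier_summary(ratios, frac_outl=0.01):
    """Summarise zero removal and IQR-based outlier removal for one sample."""
    nonzero = [r for r in ratios if r != 0]
    q1, _, q3 = statistics.quantiles(nonzero, n=4)   # quartiles of the non-zero values
    iqr = q3 - q1
    summary = {
        "no.obs.raw": len(ratios),
        "no.obs.nozero": len(nonzero),
        # Assumption: expressed as a percentage of the original sample.
        "prop.zero": 100 * (len(ratios) - len(nonzero)) / len(ratios),
        "no.obs.clean": len(nonzero),
    }
    used = None
    for k in (2, 3, 4, 5):
        kept = [r for r in nonzero if q1 - k * iqr <= r <= q3 + k * iqr]
        prop_out = 100 * (len(nonzero) - len(kept)) / len(nonzero)
        summary[f"prop.out.{k}iqr"] = prop_out
        # Assumption: pick the smallest multiple that respects frac_outl.
        if used is None and prop_out <= 100 * frac_outl:
            used = k
            summary["no.obs.clean"] = len(kept)
    summary["used"] = used
    return summary
```
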
2. modes_basic_stats.csv. We use the following notation: sigma - standard deviation, mu - mean, nu - median, Mo - mode, Q_i - the i-th quartile, q_s - the s % quantile. The columns of this file contain:

+-----------------------------------+------------------------------------------------------------------------------------+
| Column                            | Description                                                                        |
+-----------------------------------+------------------------------------------------------------------------------------+
| Name                              | name of the file analyzed                                                          |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number of modes                   | number of modes without applying any exclusion criterion                           |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number of modes (5% excluded)     | number of modes after exclusion of those with less than 5% probability mass        |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number of modes (10% excluded)    | number of modes after exclusion of those with less than 10% probability mass       |
+-----------------------------------+------------------------------------------------------------------------------------+
| Skewness                          | Pearson's moment coefficient of skewness E[((X - mu)/sigma)^3]                     |
+-----------------------------------+------------------------------------------------------------------------------------+
| Mode skewness                     | Pearson's first skewness coefficient (mu - Mo)/sigma                               |
+-----------------------------------+------------------------------------------------------------------------------------+
| Nonparametric skew                | (mu - nu)/sigma                                                                    |
+-----------------------------------+------------------------------------------------------------------------------------+
| Q50 skewness                      | Bowley's measure of skewness / Yule's coefficient (Q_3 + Q_1 - 2Q_2) / (Q_3 - Q_1) |
+-----------------------------------+------------------------------------------------------------------------------------+
| Absolute Q50 mode skewness        | (Q_3 + Q_1) / 2 - Mo                                                               |
+-----------------------------------+------------------------------------------------------------------------------------+
| Absolute Q80 mode skewness        | (q_90 + q_10) / 2 - Mo                                                             |
+-----------------------------------+------------------------------------------------------------------------------------+
| Peak i, i = 1,..., 10             | location of peak i                                                                 |
+-----------------------------------+------------------------------------------------------------------------------------+
| Probability Mass i, i = 1,..., 10 | probability mass assigned to peak i                                                |
+-----------------------------------+------------------------------------------------------------------------------------+
| Warning close modes               | flag indicating that modes lie too close; the default threshold is 0.2             |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number close modes                | number of modes lying too close, given the threshold                               |
+-----------------------------------+------------------------------------------------------------------------------------+
| Modes (close modes excluded)      | number of modes after exclusion of modes that are too close                        |
+-----------------------------------+------------------------------------------------------------------------------------+
| SD                                | sample standard deviation sigma                                                    |
+-----------------------------------+------------------------------------------------------------------------------------+
| IQR 80                            | 80% range: distance between the 90% and 10% quantiles                              |
+-----------------------------------+------------------------------------------------------------------------------------+
| IQR 90                            | 90% range: distance between the 95% and 5% quantiles                               |
+-----------------------------------+------------------------------------------------------------------------------------+
| Total number of sequences         | total number of sequences / CpG o/e ratios used for this analysis step             |
+-----------------------------------+------------------------------------------------------------------------------------+

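The skewness measures listed for modes_basic_stats.csv follow directly from their formulas, as the sketch below illustrates. It is not taken from KDEanalysis.r: the function name ``skewness_measures`` is hypothetical, the mode Mo is passed in rather than estimated, and the choice of the sample standard deviation and of the default quantile method of Python's ``statistics`` module are assumptions, so small numerical differences to the R output are to be expected.

```python
import statistics


def skewness_measures(values, mode):
    """Compute several skewness measures for one sample, given its mode Mo."""
    n = len(values)
    mu = statistics.mean(values)                      # mean
    sigma = statistics.stdev(values)                  # sample standard deviation
    nu = statistics.median(values)                    # median
    q1, q2, q3 = statistics.quantiles(values, n=4)    # quartiles Q_1, Q_2, Q_3
    return {
        "Skewness": sum(((x - mu) / sigma) ** 3 for x in values) / n,
        "Mode skewness": (mu - mode) / sigma,
        "Nonparametric skew": (mu - nu) / sigma,
        "Q50 skewness": (q3 + q1 - 2 * q2) / (q3 - q1),
        "Absolute Q50 mode skewness": (q3 + q1) / 2 - mode,
    }
```

For a symmetric sample such as 1, 2, 3, 4, 5 with mode 3, all five measures are zero; a sample with a long right tail yields positive values.
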
3. modes_bootstrap.csv. The columns of this optional file, which results from the bootstrap procedure, contain:

+-------------------------------------------+----------------------------------------------------------------------------+
| Column                                    | Description                                                                |
+-------------------------------------------+----------------------------------------------------------------------------+
| Name                                      | name of the file analyzed                                                  |
+-------------------------------------------+----------------------------------------------------------------------------+
| Number of modes (NM)                      | number of modes detected for the original sample                           |
+-------------------------------------------+----------------------------------------------------------------------------+
| % of samples with same NM                 | proportion of bootstrap samples with the same number of modes (0 - 100)    |
+-------------------------------------------+----------------------------------------------------------------------------+
| % of samples with more NM                 | proportion of bootstrap samples with a higher number of modes (0 - 100)    |
+-------------------------------------------+----------------------------------------------------------------------------+
| % of samples with less NM                 | proportion of bootstrap samples with a lower number of modes (0 - 100)     |
+-------------------------------------------+----------------------------------------------------------------------------+
| no. of samples with same NM               | number of bootstrap samples with the same number of modes                  |
+-------------------------------------------+----------------------------------------------------------------------------+
| % BS samples excluded by prob.~mass crit. | proportion of bootstrap samples excluded due to strong deviations from the |
|                                           | probability masses determined for the original sample (0 - 100)            |
+-------------------------------------------+----------------------------------------------------------------------------+
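
The mode-count columns of modes_bootstrap.csv amount to simple tallies over the bootstrap repetitions. The sketch below shows that bookkeeping only; it assumes the number of modes of each bootstrap sample has already been determined, does not model the probability mass exclusion criterion, and uses the hypothetical function name ``bootstrap_summary``.

```python
def bootstrap_summary(nm_original, nm_bootstrap):
    """Tally bootstrap mode counts against the mode count of the original sample."""
    reps = len(nm_bootstrap)
    same = sum(1 for nm in nm_bootstrap if nm == nm_original)
    more = sum(1 for nm in nm_bootstrap if nm > nm_original)
    less = reps - same - more
    return {
        "Number of modes (NM)": nm_original,
        "% of samples with same NM": 100 * same / reps,
        "% of samples with more NM": 100 * more / reps,
        "% of samples with less NM": 100 * less / reps,
        "no. of samples with same NM": same,
    }


# Ten bootstrap samples for an original sample with two modes.
result = bootstrap_summary(2, [2, 2, 3, 1, 2, 2, 2, 3, 2, 2])
```
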