..
   Source: readme.rst at changeset 1:cb8bac9d0d37 (author: cristian; date: Thu, 07 Sep 2017 10:21:45 -0400; parent: 1535ffddeff4;
   commit message: planemo upload commit 4ec21642211c1fb7427e4c98fdf0f4b9a3f8a185-dirty).
   All lines were last changed in changeset 0:1535ffddeff4 (planemo upload commit a7ac27de550a07fd6a3e3ea3fb0de65f3a10a0e6-dirty).

Notos
=====

Notos is a suite that calculates CpN observed/expected (o/e) ratios (e.g., the commonly used CpG o/e ratios) for a set of nucleotide sequences and uses Kernel Density Estimation (KDE) to model the resulting distribution.

It consists of two programs: CpGoe.pl calculates the CpN o/e ratios, and KDEanalysis.r estimates the model.
Both programs are described briefly below.

CpGoe.pl
--------

This program calculates CpN o/e ratios for nucleotide multi-FASTA files.
For each sequence found in the file, it writes the sequence name followed by the CpN o/e ratio, where N can be any of the nucleotides A, C, G or T, to a TAB-separated file.

An example call would be::

    perl CpGoe.pl -f input_species.fasta -a 1 -c CpG -o input_species_cpgoe.csv -m 200

The available contexts (-c) are CpG, CpA, CpC and CpT. The default is CpG.

The available algorithms (-a) for calculating the CpN o/e ratio are the following (shown here for CpG o/e)::

    1 => (CpG / (C * G)) * (L^2 / (L - 1))
    2 => (CpG / (C * G)) * L
    3 => (CpG / L) / ((C + G) / L)^2
    4 => CpG / ((C + G) / 2)^2

Here L denotes the length of the sequence, CpG the count of CG dinucleotides, and C and G the counts of the respective bases.

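As an illustration of the four formulas, the following Python sketch computes the CpG o/e ratio of a single sequence. It is not the CpGoe.pl implementation. The function name ``cpg_oe``, the upper-casing of the input, and returning ``None`` when the ratio is undefined are assumptions made for this example. Formula 1 is read as L^2 / (L - 1) and formula 4 as CpG / ((C + G) / 2)^2.

```python
def cpg_oe(seq, algorithm=1):
    """Return the CpG o/e ratio of one sequence, or None if it is undefined."""
    seq = seq.upper()
    length = len(seq)        # L: sequence length
    c = seq.count("C")       # count of C
    g = seq.count("G")       # count of G
    cpg = seq.count("CG")    # count of CG dinucleotides (CG cannot overlap itself)
    if c == 0 or g == 0 or length < 2:
        return None          # assumption: undefined ratios are reported as None
    if algorithm == 1:
        return (cpg / (c * g)) * (length ** 2 / (length - 1))
    if algorithm == 2:
        return (cpg / (c * g)) * length
    if algorithm == 3:
        return (cpg / length) / ((c + g) / length) ** 2
    if algorithm == 4:
        return cpg / ((c + g) / 2) ** 2
    raise ValueError("algorithm must be 1, 2, 3 or 4")
```

For other contexts (CpA, CpC, CpT) the same pattern would apply with the second base and the dinucleotide replaced accordingly.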

KDEanalysis.r
-------------

This program carries out two steps.
First, the data preparation step, mainly to remove data artifacts.
Second, the mode detection step, which is based on a KDE modelling approach.

Example of basic usage on the command line::

    Rscript ~/src/github/notos/KDEanalysis.r "Input species" input_species_cpgoe.csv

In the above case, "Input species" is used to name the generated graphs and as an identifier for each sample.
It has to be enclosed in double quotes (") if the species name contains spaces.
The input of KDEanalysis.r has the same format as the output of CpGoe.pl.

Any of the following parameters can be used:

+--------+---------------------+-----------------------------------------------------------------------------------------+
| Option | Long option         | Description                                                                             |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -o     | --frac-outl         | maximum fraction of CpG o/e ratios excluded as outliers [default 0.01]                  |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -d     | --min-dist          | minimum distance between modes; modes that are closer are joined [default 0.2]          |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -c     | --conf-level        | level of the confidence intervals of the mode positions [default 0.95]                  |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -m     | --mode-mass         | minimum probability mass of a mode [default 0.05]                                       |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -b     | --band-width        | bandwidth constant for kernels [default 1.06]                                           |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -B     | --bootstrap         | calculate confidence intervals of mode positions using bootstrap                        |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -r     | --bootstrap-reps    | number of bootstrap repetitions [default 1500]                                          |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -p     | --peak-file         | name of the output file describing the peaks of the KDE [default modes_basic_stats.csv] |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -s     | --bootstrap-file    | name of the output file with bootstrap values [default modes_bootstrap.csv]             |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -H     | --outlier-hist-file | outlier histogram file [default outliers_hist.pdf]                                      |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -C     | --cutoff-file       | outlier cutoff file [default outliers_cutoff.csv]                                       |
+--------+---------------------+-----------------------------------------------------------------------------------------+
| -k     | --kde-file          | kernel density estimation graph [default KDE.pdf]                                       |
+--------+---------------------+-----------------------------------------------------------------------------------------+

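To make the roles of the bandwidth constant and the minimum mode distance more concrete, here is a minimal Python sketch of KDE-based mode detection. It is not the code of KDEanalysis.r. It assumes a Gaussian kernel, a bandwidth of ``constant * sd * n^(-1/5)`` (the normal reference rule, which is where a constant of 1.06 usually appears), and that two modes closer than the minimum distance are joined by keeping the higher one. The function name ``kde_modes`` and the ``grid_size`` parameter are hypothetical.

```python
import math
import statistics

def kde_modes(values, bandwidth_constant=1.06, min_dist=0.2, grid_size=512):
    """Locate the modes of a Gaussian KDE evaluated on an evenly spaced grid."""
    n = len(values)
    # Assumed bandwidth rule: h = constant * sd * n^(-1/5).
    h = bandwidth_constant * statistics.stdev(values) * n ** (-1 / 5)
    lo, hi = min(values) - 3 * h, max(values) + 3 * h
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    norm = n * h * math.sqrt(2 * math.pi)
    density = [
        sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values) / norm
        for x in grid
    ]
    # A grid point is a mode candidate if it is a strict local maximum.
    peaks = [
        (grid[i], density[i])
        for i in range(1, grid_size - 1)
        if density[i - 1] < density[i] > density[i + 1]
    ]
    # Join modes closer than min_dist (assumption: keep the higher peak).
    modes = []
    for x, d in peaks:
        if modes and x - modes[-1][0] < min_dist:
            if d > modes[-1][1]:
                modes[-1] = (x, d)
        else:
            modes.append((x, d))
    return [x for x, _ in modes]
```

Called on a list of CpG o/e ratios, the function returns the estimated mode positions in ascending order. A smaller bandwidth constant produces a rougher density with more candidate modes, and a larger one smooths neighbouring modes together.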

Of special interest is the -B parameter, which triggers the bootstrap calculations.
The default settings have been calibrated through extensive testing, so we advise changing them only if you know what you are doing.

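The idea behind bootstrap confidence intervals can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure implemented in KDEanalysis.r: it uses the plain percentile method, and ``estimator`` stands for any function that maps a sample to the quantity of interest, such as the position of a mode. The names ``bootstrap_ci`` and ``seed`` are hypothetical. The defaults for the number of repetitions and the confidence level mirror those listed for -r and -c.

```python
import random

def bootstrap_ci(values, estimator, reps=1500, conf_level=0.95, seed=0):
    """Percentile bootstrap confidence interval for estimator(values)."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    n = len(values)
    # Resample with replacement and re-estimate, reps times.
    estimates = sorted(
        estimator([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    )
    alpha = 1 - conf_level
    lower = estimates[int((alpha / 2) * (reps - 1))]
    upper = estimates[int((1 - alpha / 2) * (reps - 1))]
    return lower, upper
```

Because every repetition re-runs the estimator on a full resample, the running time grows linearly with the number of repetitions.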

Output: Both the data preparation step and the mode detection step return their results as CSV files and figures.
The two figures illustrate the results of the data cleaning and mode detection steps, respectively.
The contents of the CSV files are described below.

1. outliers_cutoff.csv. The columns of this file contain:

+---------------+----------------------------------------------------------------------------------------------------------------+
| Column        | Description                                                                                                    |
+---------------+----------------------------------------------------------------------------------------------------------------+
| Name          | name of the file analyzed                                                                                      |
+---------------+----------------------------------------------------------------------------------------------------------------+
| prop.zero     | proportion of observations equal to zero excluded (relative to original sample)                                |
+---------------+----------------------------------------------------------------------------------------------------------------+
| prop.out.2iqr | proportion of values excluded if 2 * IQR was used, relative to sample after exclusion of zeros (0 - 100)       |
+---------------+----------------------------------------------------------------------------------------------------------------+
| prop.out.3iqr | proportion of values excluded if 3 * IQR was used, relative to sample after exclusion of zeros (0 - 100)       |
+---------------+----------------------------------------------------------------------------------------------------------------+
| prop.out.4iqr | proportion of values excluded if 4 * IQR was used, relative to sample after exclusion of zeros (0 - 100)       |
+---------------+----------------------------------------------------------------------------------------------------------------+
| prop.out.5iqr | proportion of values excluded if 5 * IQR was used, relative to sample after exclusion of zeros (0 - 100)       |
+---------------+----------------------------------------------------------------------------------------------------------------+
| used          | IQR used for exclusion of outliers / extreme values                                                            |
+---------------+----------------------------------------------------------------------------------------------------------------+
| no.obs.raw    | number of observations in the original sample                                                                  |
+---------------+----------------------------------------------------------------------------------------------------------------+
| no.obs.nozero | number of observations in sample after excluding values equal to zero                                          |
+---------------+----------------------------------------------------------------------------------------------------------------+
| no.obs.clean  | number of observations in sample after excluding outliers / extreme values                                     |
+---------------+----------------------------------------------------------------------------------------------------------------+

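The columns of outliers_cutoff.csv suggest a cleaning step along the following lines. This Python sketch is a guess at the logic, not the code of KDEanalysis.r. It assumes that zeros are dropped first, that a value counts as an outlier if it lies more than k * IQR below the first or above the third quartile, that the smallest k in 2..5 excluding at most ``frac_outl`` of the values is the one reported as ``used``, and that nothing is excluded if no k qualifies. It also scales prop.zero to 0 - 100, which the table does not state. The function name ``outlier_cutoff_summary`` is hypothetical.

```python
import statistics

def outlier_cutoff_summary(ratios, frac_outl=0.01):
    """Summarise zero removal and IQR-based outlier exclusion (assumed rule)."""
    nonzero = [r for r in ratios if r != 0]
    q1, _, q3 = statistics.quantiles(nonzero, n=4)
    iqr = q3 - q1
    summary = {
        "prop.zero": 100 * (len(ratios) - len(nonzero)) / len(ratios),
        "no.obs.raw": len(ratios),
        "no.obs.nozero": len(nonzero),
    }
    used, clean = None, nonzero
    for k in (2, 3, 4, 5):
        kept = [r for r in nonzero if q1 - k * iqr <= r <= q3 + k * iqr]
        excluded = len(nonzero) - len(kept)
        summary[f"prop.out.{k}iqr"] = 100 * excluded / len(nonzero)
        # Assumption: pick the smallest k that respects the outlier limit.
        if used is None and excluded <= frac_outl * len(nonzero):
            used, clean = k, kept
    summary["used"] = used
    summary["no.obs.clean"] = len(clean)
    return summary
```

The returned dictionary uses the column names from the table as keys, so it can be compared directly with a row of the CSV file.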

2. modes_basic_stats.csv. We use the following notation: sigma - standard deviation, mu - mean, nu - median, Mo - mode, Q_i - the i-th quartile, q_s - the s % quantile. The columns of this file contain:

+-----------------------------------+------------------------------------------------------------------------------------+
| Column                            | Description                                                                        |
+-----------------------------------+------------------------------------------------------------------------------------+
| Name                              | name of the file analyzed                                                          |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number of modes                   | number of modes without applying any exclusion criterion                           |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number of modes (5% excluded)     | number of modes after exclusion of those with less than 5% probability mass        |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number of modes (10% excluded)    | number of modes after exclusion of those with less than 10% probability mass       |
+-----------------------------------+------------------------------------------------------------------------------------+
| Skewness                          | Pearson's moment coefficient of skewness E[((X - mu) / sigma)^3]                   |
+-----------------------------------+------------------------------------------------------------------------------------+
| Mode skewness                     | Pearson's first skewness coefficient (mu - Mo) / sigma                             |
+-----------------------------------+------------------------------------------------------------------------------------+
| Nonparametric skew                | (mu - nu) / sigma                                                                  |
+-----------------------------------+------------------------------------------------------------------------------------+
| Q50 skewness                      | Bowley's measure of skewness / Yule's coefficient (Q_3 + Q_1 - 2Q_2) / (Q_3 - Q_1) |
+-----------------------------------+------------------------------------------------------------------------------------+
| Absolute Q50 mode skewness        | (Q_3 + Q_1) / 2 - Mo                                                               |
+-----------------------------------+------------------------------------------------------------------------------------+
| Absolute Q80 mode skewness        | (q_90 + q_10) / 2 - Mo                                                             |
+-----------------------------------+------------------------------------------------------------------------------------+
| Peak i, i = 1,..., 10             | location of peak i                                                                 |
+-----------------------------------+------------------------------------------------------------------------------------+
| Probability Mass i, i = 1,..., 10 | probability mass assigned to peak i                                                |
+-----------------------------------+------------------------------------------------------------------------------------+
| Warning close modes               | flag indicating that modes lie too close; the default threshold is 0.2             |
+-----------------------------------+------------------------------------------------------------------------------------+
| Number close modes                | number of modes lying too close, given the threshold                               |
+-----------------------------------+------------------------------------------------------------------------------------+
| Modes (close modes excluded)      | number of modes after exclusion of modes that are too close                        |
+-----------------------------------+------------------------------------------------------------------------------------+
| SD                                | sample standard deviation sigma                                                    |
+-----------------------------------+------------------------------------------------------------------------------------+
| IQR 80                            | distance between the 90 % and 10 % quantiles (80 % range)                          |
+-----------------------------------+------------------------------------------------------------------------------------+
| IQR 90                            | distance between the 95 % and 5 % quantiles (90 % range)                           |
+-----------------------------------+------------------------------------------------------------------------------------+
| Total number of sequences         | total number of sequences / CpG o/e ratios used for this analysis step             |
+-----------------------------------+------------------------------------------------------------------------------------+

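The skewness measures listed for modes_basic_stats.csv can be written out directly from their formulas. The Python sketch below is only an illustration. The function name ``skewness_measures`` is hypothetical, the mode Mo is passed in as an argument (for example from a KDE), and the quantiles come from Python's ``statistics.quantiles`` with its default method, which may differ slightly from the quantile definition used by KDEanalysis.r.

```python
import statistics

def skewness_measures(values, mode):
    """Compute the skewness measures of the table for one sample."""
    n = len(values)
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)    # sample standard deviation
    nu = statistics.median(values)
    q1, q2, q3 = statistics.quantiles(values, n=4)
    deciles = statistics.quantiles(values, n=10)
    q10, q90 = deciles[0], deciles[8]   # 10 % and 90 % quantiles
    return {
        "Skewness": sum(((x - mu) / sigma) ** 3 for x in values) / n,
        "Mode skewness": (mu - mode) / sigma,
        "Nonparametric skew": (mu - nu) / sigma,
        "Q50 skewness": (q3 + q1 - 2 * q2) / (q3 - q1),
        "Absolute Q50 mode skewness": (q3 + q1) / 2 - mode,
        "Absolute Q80 mode skewness": (q90 + q10) / 2 - mode,
    }
```

For a symmetric sample whose mode coincides with its mean, all six measures are zero. Positive values indicate a longer right tail.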

3. modes_bootstrap.csv. The columns of this optional file, which results from the bootstrap procedure, contain:

+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| Column                                    | Description                                                                                                                                |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| Name                                      | name of the file analyzed                                                                                                                  |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| Number of modes (NM)                      | number of modes detected for the original sample                                                                                           |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| % of samples with same NM                 | proportion of bootstrap samples with the same number of modes (0 - 100)                                                                    |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| % of samples with more NM                 | proportion of bootstrap samples with a higher number of modes (0 - 100)                                                                    |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| % of samples with less NM                 | proportion of bootstrap samples with a lower number of modes (0 - 100)                                                                     |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| no. of samples with same NM               | number of bootstrap samples with the same number of modes                                                                                  |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| % BS samples excluded by prob.~mass crit. | proportion of bootstrap samples excluded due to strong deviations from the probability masses determined for the original sample (0 - 100) |
+-------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
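
The mode-count columns of modes_bootstrap.csv are simple tallies over the bootstrap samples. As a rough Python sketch (the function name ``summarize_bootstrap`` and its arguments are hypothetical), they could be derived from the number of modes of the original sample and the list of mode counts found in the bootstrap samples. The column based on the probability mass criterion is left out because this README does not describe that criterion.

```python
def summarize_bootstrap(original_nm, bootstrap_nm):
    """Tally how often bootstrap samples show the same, more or fewer modes."""
    total = len(bootstrap_nm)
    same = sum(1 for nm in bootstrap_nm if nm == original_nm)
    more = sum(1 for nm in bootstrap_nm if nm > original_nm)
    less = total - same - more
    return {
        "Number of modes (NM)": original_nm,
        "% of samples with same NM": 100 * same / total,
        "% of samples with more NM": 100 * more / total,
        "% of samples with less NM": 100 * less / total,
        "no. of samples with same NM": same,
    }
```

A high percentage of samples with the same number of modes indicates that the mode count of the original sample is stable under resampling.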