diff lifelines_tool/test-data/readme_sample @ 0:dd49a7040643 draft

Initial commit
author fubar
date Wed, 09 Aug 2023 11:12:16 +0000
parents
children 232b874046a7
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/lifelines_tool/test-data/readme_sample	Wed Aug 09 11:12:16 2023 +0000
@@ -0,0 +1,119 @@
+## Lifelines tool starting.
+Using data header = Index(['Unnamed: 0', 'week', 'arrest', 'fin', 'age', 'race', 'wexp', 'mar',
+       'paro', 'prio'],
+      dtype='object') time column = week status column = arrest
+Logrank test for race - 0 vs 1
+
+<lifelines.StatisticalResult: logrank_test>
+               t_0 = -1
+ null_distribution = chi squared
+degrees_of_freedom = 1
+             alpha = 0.99
+         test_name = logrank_test
+
+---
+ test_statistic    p  -log2(p)
+           0.58 0.45      1.16
+### Lifelines test of Proportional Hazards results with prio, age, race, paro, mar, fin as covariates on KM and CPH in lifelines test
+<lifelines.CoxPHFitter: fitted with 432 total observations, 318 right-censored observations>
+             duration col = 'week'
+                event col = 'arrest'
+      baseline estimation = breslow
+   number of observations = 432
+number of events observed = 114
+   partial log-likelihood = -659.00
+         time fit was run = 2023-08-09 07:43:37 UTC
+
+---
+            coef  exp(coef)   se(coef)   coef lower 95%   coef upper 95%  exp(coef) lower 95%  exp(coef) upper 95%
+covariate                                                                                                         
+prio        0.10       1.10       0.03             0.04             0.15                 1.04                 1.16
+age        -0.06       0.94       0.02            -0.10            -0.02                 0.90                 0.98
+race        0.32       1.38       0.31            -0.28             0.92                 0.75                 2.52
+paro       -0.09       0.91       0.20            -0.47             0.29                 0.62                 1.34
+mar        -0.48       0.62       0.38            -1.22             0.25                 0.30                 1.29
+fin        -0.38       0.68       0.19            -0.75            -0.00                 0.47                 1.00
+
+            cmp to     z      p   -log2(p)
+covariate                                 
+prio          0.00  3.53 <0.005      11.26
+age           0.00 -2.95 <0.005       8.28
+race          0.00  1.04   0.30       1.75
+paro          0.00 -0.46   0.65       0.63
+mar           0.00 -1.28   0.20       2.32
+fin           0.00 -1.98   0.05       4.40
+---
+Concordance = 0.63
+Partial AIC = 1330.00
+log-likelihood ratio test = 32.77 on 6 df
+-log2(p) of ll-ratio test = 16.39
+
+
+   Bootstrapping lowess lines. May take a moment...
+
+
+   Bootstrapping lowess lines. May take a moment...
+
+The ``p_value_threshold`` is set at 0.01. Even under the null hypothesis of no violations, some
+covariates will be below the threshold by chance. This is compounded when there are many covariates.
+Similarly, when there are lots of observations, even minor deviances from the proportional hazard
+assumption will be flagged.
+
+With that in mind, it's best to use a combination of statistical tests and visual tests to determine
+the most serious violations. Produce visual plots using ``check_assumptions(..., show_plots=True)``
+and looking for non-constant lines. See link [A] below for a full example.
+
+<lifelines.StatisticalResult: proportional_hazard_test>
+ null_distribution = chi squared
+degrees_of_freedom = 1
+             model = <lifelines.CoxPHFitter: fitted with 432 total observations, 318 right-censored observations>
+         test_name = proportional_hazard_test
+
+---
+           test_statistic    p  -log2(p)
+age  km              6.99 0.01      6.93
+     rank            7.40 0.01      7.26
+fin  km              0.02 0.90      0.15
+     rank            0.01 0.91      0.13
+mar  km              1.64 0.20      2.32
+     rank            1.80 0.18      2.48
+paro km              0.06 0.81      0.31
+     rank            0.07 0.79      0.34
+prio km              0.92 0.34      1.57
+     rank            0.88 0.35      1.52
+race km              1.70 0.19      2.38
+     rank            1.68 0.19      2.36
+
+
+1. Variable 'age' failed the non-proportional test: p-value is 0.0065.
+
+   Advice 1: the functional form of the variable 'age' might be incorrect. That is, there may be
+non-linear terms missing. The proportional hazard test used is very sensitive to incorrect
+functional forms. See documentation in link [D] below on how to specify a functional form.
+
+   Advice 2: try binning the variable 'age' using pd.cut, and then specify it in `strata=['age',
+...]` in the call in `.fit`. See documentation in link [B] below.
+
+   Advice 3: try adding an interaction term with your time variable. See documentation in link [C]
+below.
+
+
+   Bootstrapping lowess lines. May take a moment...
+
+
+   Bootstrapping lowess lines. May take a moment...
+
+
+   Bootstrapping lowess lines. May take a moment...
+
+
+   Bootstrapping lowess lines. May take a moment...
+
+
+---
+[A]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html
+[B]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Bin-variable-and-stratify-on-it
+[C]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Introduce-time-varying-covariates
+[D]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Modify-the-functional-form
+[E]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Stratification
+
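
The `p` and `-log2(p)` columns in the output above can be cross-checked by hand. The following is a minimal sketch using only the Python standard library, not part of the tool itself; the helper names (`chi2_sf_1df`, `chi2_sf_6df`, `neg_log2`) are hypothetical, and the inputs are the rounded values printed in the tables, so the results should agree with the printed columns only approximately.

```python
import math


def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.

    If X ~ chi2(1) then X = Z**2 with Z standard normal, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


def chi2_sf_6df(stat):
    """Upper-tail probability of a chi-squared variable with 6 degrees of freedom.

    For even df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)**k / k!, here with m = 3.
    """
    half = stat / 2.0
    return math.exp(-half) * (1.0 + half + half ** 2 / 2.0)


def neg_log2(p):
    """The -log2(p) transform reported next to each p-value."""
    return -math.log2(p)


# Log-rank test for race (0 vs 1): rounded test_statistic from the table.
logrank_p = chi2_sf_1df(0.58)
logrank_neg_log2 = neg_log2(logrank_p)

# Log-likelihood ratio test of the Cox model: 32.77 on 6 df.
llr_neg_log2 = neg_log2(chi2_sf_6df(32.77))

# 'age' proportional hazards test: p-value quoted in the advice text.
age_neg_log2 = neg_log2(0.0065)

# Hazard ratio for 'prio': exp(coef) with the rounded coef = 0.10.
prio_hazard_ratio = math.exp(0.10)

print(f"logrank p = {logrank_p:.2f}, -log2(p) = {logrank_neg_log2:.2f}")
print(f"LLR -log2(p) = {llr_neg_log2:.2f}")
print(f"age -log2(p) = {age_neg_log2:.2f}")
print(f"prio exp(coef) = {prio_hazard_ratio:.2f}")
```

Because the inputs are already rounded to two decimals, a difference in the last printed digit is expected and does not indicate an error in the tool output.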