view lifelines_tool/run_log.txt @ 2:dd5e65893cb8 draft default tip

add survival and collapsed life table outputs suggested by Wolfgang
author fubar
date Thu, 10 Aug 2023 22:52:45 +0000

## Lifelines tool
Input data header = Index(['Unnamed: 0', 'week', 'arrest', 'fin', 'age', 'race', 'wexp', 'mar',
       'paro', 'prio'],
      dtype='object')
time column = week
status column = arrest
#### No grouping variable, so no log rank or other Kaplan-Meier statistical output is available
Survival table using time week and event arrest
          removed  observed  censored  entrance  at_risk
event_at                                                
0.0             0         0         0       432      432
1.0             1         1         0         0      432
2.0             1         1         0         0      431
3.0             1         1         0         0      430
4.0             1         1         0         0      429
5.0             1         1         0         0      428
6.0             1         1         0         0      427
7.0             1         1         0         0      426
8.0             5         5         0         0      425
9.0             2         2         0         0      420
10.0            1         1         0         0      418
11.0            2         2         0         0      417
12.0            2         2         0         0      415
13.0            1         1         0         0      413
14.0            3         3         0         0      412
15.0            2         2         0         0      409
16.0            2         2         0         0      407
17.0            3         3         0         0      405
18.0            3         3         0         0      402
19.0            2         2         0         0      399
20.0            5         5         0         0      397
21.0            2         2         0         0      392
22.0            1         1         0         0      390
23.0            1         1         0         0      389
24.0            4         4         0         0      388
25.0            3         3         0         0      384
26.0            3         3         0         0      381
27.0            2         2         0         0      378
28.0            2         2         0         0      376
30.0            2         2         0         0      374
31.0            1         1         0         0      372
32.0            2         2         0         0      371
33.0            2         2         0         0      369
34.0            2         2         0         0      367
35.0            4         4         0         0      365
36.0            3         3         0         0      361
37.0            4         4         0         0      358
38.0            1         1         0         0      354
39.0            2         2         0         0      353
40.0            4         4         0         0      351
42.0            2         2         0         0      347
43.0            4         4         0         0      345
44.0            2         2         0         0      341
45.0            2         2         0         0      339
46.0            4         4         0         0      337
47.0            1         1         0         0      333
48.0            2         2         0         0      332
49.0            5         5         0         0      330
50.0            3         3         0         0      325
52.0          322         4       318         0      322
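The `observed` and `at_risk` columns above are enough to recompute a Kaplan-Meier survival estimate by hand. The sketch below is an illustration only, not the tool's own code: it hard-codes the rows for weeks 1 to 10 from the table, and the helper name `kaplan_meier` is hypothetical.

```python
# Rows copied from the survival table above: (week, observed events, at_risk).
rows = [
    (1, 1, 432), (2, 1, 431), (3, 1, 430), (4, 1, 429), (5, 1, 428),
    (6, 1, 427), (7, 1, 426), (8, 5, 425), (9, 2, 420), (10, 1, 418),
]


def kaplan_meier(rows):
    """Return [(week, S(week))] using the product-limit formula.

    S(t) is the running product of (1 - observed / at_risk)
    over all event times up to and including t.
    """
    survival = 1.0
    curve = []
    for week, observed, at_risk in rows:
        survival *= 1.0 - observed / at_risk
        curve.append((week, survival))
    return curve


curve = kaplan_meier(rows)
for week, s in curve:
    print(f"week {week:2d}: S = {s:.4f}")
```

Because nothing is censored before week 52 in this table, the estimate at week 10 reduces to 417/432, the share of subjects not yet arrested.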
Life table using time week and event arrest
                  removed  observed  censored  at_risk
event_at                                              
(-0.001, 13.844]       20        20         0      432
(13.844, 27.687]       36        36         0      412
(27.687, 41.531]       29        29         0      376
(41.531, 55.374]      347        29       318      347
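The collapsed life table can be sanity-checked against the rest of the log with simple bookkeeping. The sketch below assumes no subjects enter after time 0 (the survival table shows all 432 entering at 0); the helper name `check_life_table` is hypothetical.

```python
# Rows copied from the life table above:
# (interval, removed, observed, censored, at_risk)
life_table = [
    ("(-0.001, 13.844]", 20, 20, 0, 432),
    ("(13.844, 27.687]", 36, 36, 0, 412),
    ("(27.687, 41.531]", 29, 29, 0, 376),
    ("(41.531, 55.374]", 347, 29, 318, 347),
]


def check_life_table(table):
    """Verify the bookkeeping and return (removed, observed, censored) totals."""
    expected_at_risk = table[0][4]
    totals = [0, 0, 0]
    for interval, removed, observed, censored, at_risk in table:
        # Every subject leaving an interval either had the event or was censored.
        assert removed == observed + censored, interval
        # With no late entries, at_risk shrinks by the previous interval's removals.
        assert at_risk == expected_at_risk, interval
        expected_at_risk = at_risk - removed
        totals[0] += removed
        totals[1] += observed
        totals[2] += censored
    return tuple(totals)


totals = check_life_table(life_table)
print(totals)  # (432, 114, 318)
```

The totals agree with the 432 observations, 114 observed events and 318 right-censored observations reported by the Cox model further down.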
### Lifelines test of Proportional Hazards results with prio, age, race, paro, mar, fin as covariates on test
<lifelines.CoxPHFitter: fitted with 432 total observations, 318 right-censored observations>
             duration col = 'week'
                event col = 'arrest'
      baseline estimation = breslow
   number of observations = 432
number of events observed = 114
   partial log-likelihood = -659.00
         time fit was run = 2023-08-10 11:57:10 UTC

---
            coef  exp(coef)   se(coef)   coef lower 95%   coef upper 95%  exp(coef) lower 95%  exp(coef) upper 95%
covariate                                                                                                         
prio        0.10       1.10       0.03             0.04             0.15                 1.04                 1.16
age        -0.06       0.94       0.02            -0.10            -0.02                 0.90                 0.98
race        0.32       1.38       0.31            -0.28             0.92                 0.75                 2.52
paro       -0.09       0.91       0.20            -0.47             0.29                 0.62                 1.34
mar        -0.48       0.62       0.38            -1.22             0.25                 0.30                 1.29
fin        -0.38       0.68       0.19            -0.75            -0.00                 0.47                 1.00

            cmp to     z      p   -log2(p)
covariate                                 
prio          0.00  3.53 <0.005      11.26
age           0.00 -2.95 <0.005       8.28
race          0.00  1.04   0.30       1.75
paro          0.00 -0.46   0.65       0.63
mar           0.00 -1.28   0.20       2.32
fin           0.00 -1.98   0.05       4.40
---
Concordance = 0.63
Partial AIC = 1330.00
log-likelihood ratio test = 32.77 on 6 df
-log2(p) of ll-ratio test = 16.39
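Several of the summary figures above can be approximately recomputed from one another. The sketch below uses only values printed in the log, which are rounded to two decimals, so small discrepancies in the last digit are expected. The formula used for partial AIC (minus twice the partial log-likelihood plus twice the number of coefficients) is the usual definition and reproduces the logged value; the helper `chi2_sf_even_df` is hypothetical and only valid for even degrees of freedom.

```python
import math

# Values copied from the summary above (rounded to two decimals in the log).
coef_fin, se_fin = -0.38, 0.19
partial_log_likelihood = -659.00
n_covariates = 6
lr_statistic, lr_df = 32.77, 6

# Hazard ratio and Wald 95% interval for 'fin'.
z_crit = 1.96  # approximate 97.5th percentile of the standard normal
hazard_ratio = math.exp(coef_fin)
ci_low = math.exp(coef_fin - z_crit * se_fin)
ci_high = math.exp(coef_fin + z_crit * se_fin)

# Partial AIC = -2 * partial log-likelihood + 2 * number of coefficients.
partial_aic = -2 * partial_log_likelihood + 2 * n_covariates


def chi2_sf_even_df(x, df):
    """Upper tail of a chi-squared distribution (closed form, even df only)."""
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(df // 2))


lr_p = chi2_sf_even_df(lr_statistic, lr_df)
neg_log2_p = -math.log2(lr_p)

print(f"fin hazard ratio = {hazard_ratio:.2f} ({ci_low:.2f}, {ci_high:.2f})")
print(f"partial AIC = {partial_aic:.2f}")
print(f"-log2(p) of ll-ratio test = {neg_log2_p:.2f}")
```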


   Bootstrapping lowess lines. May take a moment...


The ``p_value_threshold`` is set at 0.01. Even under the null hypothesis of no violations, some
covariates will be below the threshold by chance. This is compounded when there are many covariates.
Similarly, when there are lots of observations, even minor deviations from the proportional hazard
assumption will be flagged.

With that in mind, it's best to use a combination of statistical tests and visual tests to determine
the most serious violations. Produce visual plots using ``check_assumptions(..., show_plots=True)``
and look for non-constant lines. See link [A] below for a full example.
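The remark about chance findings can be made concrete with a little arithmetic. The sketch below assumes the tests are independent, which is a simplification made here for illustration (the km and rank tests on the same model are correlated); the function name `prob_any_false_flag` is hypothetical.

```python
# Chance that at least one of k independent tests falls below the threshold
# when every null hypothesis is true (no real violations anywhere).
p_value_threshold = 0.01  # value reported in the message above


def prob_any_false_flag(alpha, k):
    """Probability of one or more false flags among k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


for k in (1, 6, 12, 50):
    print(f"{k:2d} tests: {prob_any_false_flag(p_value_threshold, k):.3f}")
```

With the six covariates of this run, the chance of at least one spurious flag is already close to 6% under these assumptions, which is why the message recommends pairing the test with visual checks.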

<lifelines.StatisticalResult: proportional_hazard_test>
 null_distribution = chi squared
degrees_of_freedom = 1
             model = <lifelines.CoxPHFitter: fitted with 432 total observations, 318 right-censored observations>
         test_name = proportional_hazard_test

---
           test_statistic    p  -log2(p)
age  km              6.99 0.01      6.93
     rank            7.40 0.01      7.26
fin  km              0.02 0.90      0.15
     rank            0.01 0.91      0.13
mar  km              1.64 0.20      2.32
     rank            1.80 0.18      2.48
paro km              0.06 0.81      0.31
     rank            0.07 0.79      0.34
prio km              0.92 0.34      1.57
     rank            0.88 0.35      1.52
race km              1.70 0.19      2.38
     rank            1.68 0.19      2.36
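The `p` and `-log2(p)` columns follow from the test statistics, given the stated null distribution (chi squared, 1 degree of freedom). The sketch below recomputes them from the rounded statistics in the table, so the last digit may differ from the log; the helper name `chi2_sf_df1` is hypothetical.

```python
import math

p_value_threshold = 0.01

# (covariate, time transform) -> test statistic, copied from the table above.
test_statistics = {
    ("age", "km"): 6.99, ("age", "rank"): 7.40,
    ("fin", "km"): 0.02, ("fin", "rank"): 0.01,
    ("mar", "km"): 1.64, ("mar", "rank"): 1.80,
    ("paro", "km"): 0.06, ("paro", "rank"): 0.07,
    ("prio", "km"): 0.92, ("prio", "rank"): 0.88,
    ("race", "km"): 1.70, ("race", "rank"): 1.68,
}


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


p_values = {key: chi2_sf_df1(stat) for key, stat in test_statistics.items()}
flagged = sorted({cov for (cov, _), p in p_values.items() if p < p_value_threshold})

for (cov, transform), p in p_values.items():
    print(f"{cov:5s}{transform:5s} p = {p:.4f}  -log2(p) = {-math.log2(p):.2f}")
print("flagged:", flagged)
```

Only `age` falls below the 0.01 threshold, which matches the single failure reported next.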


1. Variable 'age' failed the non-proportional test: p-value is 0.0065.

   Advice 1: the functional form of the variable 'age' might be incorrect. That is, there may be
non-linear terms missing. The proportional hazard test used is very sensitive to incorrect
functional forms. See documentation in link [D] below on how to specify a functional form.

   Advice 2: try binning the variable 'age' using pd.cut, and then specify it in `strata=['age',
...]` in the call in `.fit`. See documentation in link [B] below.

   Advice 3: try adding an interaction term with your time variable. See documentation in link [C]
below.
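As a rough illustration of the binning step in Advice 2, the sketch below assigns ages to right-closed intervals using only the standard library. The bin edges and sample ages are hypothetical, chosen for illustration; on a DataFrame, `pd.cut` with the same edges yields comparable right-closed categories, and the binned column would then be passed through `strata=[...]` as the advice describes.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bin edges and ages, for illustration only.
edges = [15, 20, 25, 30, 45]
ages = [27, 18, 19, 23, 19, 24, 25, 21, 22, 20, 44]


def age_stratum(age, edges):
    """Label the right-closed interval (lo, hi] containing age, or None if outside."""
    i = bisect_left(edges, age)
    if i == 0 or i == len(edges):
        return None  # at or below the lowest edge, or above the highest edge
    return f"({edges[i - 1]}, {edges[i]}]"


strata = [age_stratum(a, edges) for a in ages]
for label, count in sorted(Counter(strata).items()):
    print(label, count)
```

Coarser bins leave more subjects per stratum, while finer bins track the age effect more closely; the choice of edges is a judgment call that the log itself does not settle.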


   Bootstrapping lowess lines. May take a moment...



---
[A]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html
[B]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Bin-variable-and-stratify-on-it
[C]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Introduce-time-varying-covariates
[D]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Modify-the-functional-form
[E]  https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Stratification