Mercurial > repos > fubar > lifelines_km_cph_tool
view lifelines_tool/test-data/readme_sample @ 2:dd5e65893cb8 draft default tip
add survival and collapsed life table outputs suggested by Wolfgang
author | fubar |
---|---|
date | Thu, 10 Aug 2023 22:52:45 +0000 |
parents | 232b874046a7 |
children |
line wrap: on
line source
## Lifelines tool Input data header = Index(['Unnamed: 0', 'week', 'arrest', 'fin', 'age', 'race', 'wexp', 'mar', 'paro', 'prio'], dtype='object') time column = week status column = arrest Logrank test for race - 0 vs 1 <lifelines.StatisticalResult: logrank_test> t_0 = -1 null_distribution = chi squared degrees_of_freedom = 1 alpha = 0.99 test_name = logrank_test --- test_statistic p -log2(p) 0.58 0.45 1.16 #### Survival table using time week and event arrest removed observed censored entrance at_risk event_at 0.0 0 0 0 432 432 1.0 1 1 0 0 432 2.0 1 1 0 0 431 3.0 1 1 0 0 430 4.0 1 1 0 0 429 5.0 1 1 0 0 428 6.0 1 1 0 0 427 7.0 1 1 0 0 426 8.0 5 5 0 0 425 9.0 2 2 0 0 420 10.0 1 1 0 0 418 11.0 2 2 0 0 417 12.0 2 2 0 0 415 13.0 1 1 0 0 413 14.0 3 3 0 0 412 15.0 2 2 0 0 409 16.0 2 2 0 0 407 17.0 3 3 0 0 405 18.0 3 3 0 0 402 19.0 2 2 0 0 399 20.0 5 5 0 0 397 21.0 2 2 0 0 392 22.0 1 1 0 0 390 23.0 1 1 0 0 389 24.0 4 4 0 0 388 25.0 3 3 0 0 384 26.0 3 3 0 0 381 27.0 2 2 0 0 378 28.0 2 2 0 0 376 30.0 2 2 0 0 374 31.0 1 1 0 0 372 32.0 2 2 0 0 371 33.0 2 2 0 0 369 34.0 2 2 0 0 367 35.0 4 4 0 0 365 36.0 3 3 0 0 361 37.0 4 4 0 0 358 38.0 1 1 0 0 354 39.0 2 2 0 0 353 40.0 4 4 0 0 351 42.0 2 2 0 0 347 43.0 4 4 0 0 345 44.0 2 2 0 0 341 45.0 2 2 0 0 339 46.0 4 4 0 0 337 47.0 1 1 0 0 333 48.0 2 2 0 0 332 49.0 5 5 0 0 330 50.0 3 3 0 0 325 52.0 322 4 318 0 322 #### Life table using time week and event arrest removed observed censored at_risk event_at (-0.001, 13.844] 20 20 0 432 (13.844, 27.687] 36 36 0 412 (27.687, 41.531] 29 29 0 376 (41.531, 55.374] 347 29 318 347 ### Lifelines test of Proportional Hazards results with prio, age, race, paro, mar, fin as covariates on KM and CPH in lifelines test <lifelines.CoxPHFitter: fitted with 432 total observations, 318 right-censored observations> duration col = 'week' event col = 'arrest' baseline estimation = breslow number of observations = 432 number of events observed = 114 partial log-likelihood = -659.00 time fit was run = 2023-08-10 12:00:14 UTC --- coef exp(coef) se(coef) coef lower 95% coef upper 95% exp(coef) lower 95% exp(coef) upper 95% covariate prio 0.10 1.10 0.03 0.04 0.15 1.04 1.16 age -0.06 0.94 0.02 -0.10 -0.02 0.90 0.98 race 0.32 1.38 0.31 -0.28 0.92 0.75 2.52 paro -0.09 0.91 0.20 -0.47 0.29 0.62 1.34 mar -0.48 0.62 0.38 -1.22 0.25 0.30 1.29 fin -0.38 0.68 0.19 -0.75 -0.00 0.47 1.00 cmp to z p -log2(p) covariate prio 0.00 3.53 <0.005 11.26 age 0.00 -2.95 <0.005 8.28 race 0.00 1.04 0.30 1.75 paro 0.00 -0.46 0.65 0.63 mar 0.00 -1.28 0.20 2.32 fin 0.00 -1.98 0.05 4.40 --- Concordance = 0.63 Partial AIC = 1330.00 log-likelihood ratio test = 32.77 on 6 df -log2(p) of ll-ratio test = 16.39 Bootstrapping lowess lines. May take a moment... Bootstrapping lowess lines. May take a moment... The ``p_value_threshold`` is set at 0.01. Even under the null hypothesis of no violations, some covariates will be below the threshold by chance. This is compounded when there are many covariates. Similarly, when there are lots of observations, even minor deviances from the proportional hazard assumption will be flagged. With that in mind, it's best to use a combination of statistical tests and visual tests to determine the most serious violations. Produce visual plots using ``check_assumptions(..., show_plots=True)`` and looking for non-constant lines. See link [A] below for a full example. <lifelines.StatisticalResult: proportional_hazard_test> null_distribution = chi squared degrees_of_freedom = 1 model = <lifelines.CoxPHFitter: fitted with 432 total observations, 318 right-censored observations> test_name = proportional_hazard_test --- test_statistic p -log2(p) age km 6.99 0.01 6.93 rank 7.40 0.01 7.26 fin km 0.02 0.90 0.15 rank 0.01 0.91 0.13 mar km 1.64 0.20 2.32 rank 1.80 0.18 2.48 paro km 0.06 0.81 0.31 rank 0.07 0.79 0.34 prio km 0.92 0.34 1.57 rank 0.88 0.35 1.52 race km 1.70 0.19 2.38 rank 1.68 0.19 2.36 1. Variable 'age' failed the non-proportional test: p-value is 0.0065. Advice 1: the functional form of the variable 'age' might be incorrect. That is, there may be non-linear terms missing. The proportional hazard test used is very sensitive to incorrect functional forms. See documentation in link [D] below on how to specify a functional form. Advice 2: try binning the variable 'age' using pd.cut, and then specify it in `strata=['age', ...]` in the call in `.fit`. See documentation in link [B] below. Advice 3: try adding an interaction term with your time variable. See documentation in link [C] below. Bootstrapping lowess lines. May take a moment... Bootstrapping lowess lines. May take a moment... Bootstrapping lowess lines. May take a moment... Bootstrapping lowess lines. May take a moment... --- [A] https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html [B] https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Bin-variable-and-stratify-on-it [C] https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Introduce-time-varying-covariates [D] https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Modify-the-functional-form [E] https://lifelines.readthedocs.io/en/latest/jupyter_notebooks/Proportional%20hazard%20assumption.html#Stratification