CoxTimeVaryingFitter¶
- class lifelines.fitters.cox_time_varying_fitter.CoxTimeVaryingFitter(alpha=0.05, penalizer=0.0, l1_ratio: float = 0.0, strata=None)¶
This class implements fitting Cox’s time-varying proportional hazard model:
\[h(t|x(t)) = h_0(t)\exp((x(t)-\overline{x})'\beta)\]
- Parameters:
alpha (float, optional (default=0.05)) – the level in the confidence intervals.
penalizer (float, optional) – the coefficient of an L2 penalizer in the regression
- params_¶
The estimated coefficients. Changed in version 0.22.0: used to be .hazards_
- Type:
Series
- hazard_ratios_¶
The exp(coefficients)
- Type:
Series
- confidence_intervals_¶
The lower and upper confidence intervals for the hazard coefficients
- Type:
DataFrame
- event_observed¶
The event_observed variable provided
- Type:
Series
- weights¶
The weights variable provided
- Type:
Series
- variance_matrix_¶
The variance matrix of the coefficients
- Type:
DataFrame
- strata¶
the strata provided
- Type:
list | str
- standard_errors_¶
the standard errors of the estimates
- Type:
Series
- baseline_cumulative_hazard_¶
- Type:
DataFrame
- baseline_survival_¶
- Type:
DataFrame
- fit(df, event_col, start_col='start', stop_col='stop', weights_col=None, id_col=None, show_progress=False, robust=False, strata=None, initial_point=None, formula: str = None, fit_options: dict | None = None)¶
Fit the Cox Proportional Hazard model to a time varying dataset. Tied survival times are handled using Efron’s tie-method.
- Parameters:
df (DataFrame) – a Pandas DataFrame with necessary columns start_col, stop_col and event_col, plus other covariates. start_col and stop_col give the start and end of each subject-period row. event_col refers to whether the ‘death’ event was observed: 1 if observed, 0 otherwise (censored).
event_col (string) – the column in DataFrame that contains the subjects’ death observation. If left as None, assume all individuals are non-censored.
start_col (string) – the column that contains the start of a subject’s time period.
stop_col (string) – the column that contains the end of a subject’s time period.
weights_col (string, optional) – the column that contains (possibly time-varying) weight of each subject-period row.
id_col (string, optional) – A subject could have multiple rows in the DataFrame. This column contains the unique identifier per subject. If not provided, it’s up to the user to make sure that there are no violations.
show_progress (bool, optional (default=False)) – since the fitter is iterative, show convergence diagnostics.
robust (bool, optional (default=False)) – Compute the robust errors using the Huber sandwich estimator, aka Wei-Lin estimate. This does not handle ties, so if there is a high number of ties, results may differ significantly. See “The Robust Inference for the Cox Proportional Hazards Model”, Journal of the American Statistical Association, Vol. 84, No. 408 (Dec., 1989), pp. 1074-1078
strata (list | string, optional) – specify a column or list of columns to use in stratification. This is useful if a categorical covariate does not obey the proportional hazard assumption. This is used similarly to the strata expression in R. See http://courses.washington.edu/b515/l17.pdf.
initial_point ((d,) numpy array, optional) – initialize the starting point of the iterative algorithm. Default is the zero vector.
formula (str, optional) – An R-like formula for transforming the covariates
fit_options (dict, optional) –
- Override the default values in the Newton-Raphson (NR) algorithm:
step_size: 0.95, precision: 1e-07, r_precision: 1e-9, max_steps: 500
- Returns:
self – self, with additional properties like params_ and print_summary
- Return type:
CoxTimeVaryingFitter
- log_likelihood_ratio_test()¶
This function computes the likelihood ratio test for the Cox model. We compare the existing model (with all the covariates) to the trivial model of no covariates.
Conveniently, we can actually use CoxPHFitter class to do most of the work.
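As a sketch of the standard likelihood ratio statistic behind this comparison (the notation is assumed here, not taken from the library): with \(\ell(\hat{\beta})\) the partial log-likelihood of the fitted model, \(\ell(0)\) that of the model with no covariates, and \(d\) the number of covariates,
\[\Lambda = 2\left(\ell(\hat{\beta}) - \ell(0)\right), \qquad \Lambda \sim \chi^2_d \text{ approximately, under the null hypothesis } \beta = 0\]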
- plot(columns=None, ax=None, **errorbar_kwargs)¶
Produces a visual representation of the coefficients, including their standard errors and magnitudes.
- Parameters:
columns (list, optional) – specify a subset of the columns to plot
errorbar_kwargs – pass in additional plotting commands to matplotlib errorbar command
- Returns:
ax – the matplotlib axis that can be edited.
- Return type:
matplotlib axis
- predict_log_partial_hazard(X) → Series¶
This is equivalent to R’s linear.predictors. Returns the log of the partial hazard for the individuals, partial since the baseline hazard is not included. Equal to \((x - \bar{x})'\beta\)
- Parameters:
X (numpy array or DataFrame) – a (n,d) covariate numpy array or DataFrame. If a DataFrame, columns can be in any order. If a numpy array, columns must be in the same order as the training data.
- Return type:
Series
Note
If X is a DataFrame, the order of the columns does not matter. But if X is an array, then the column ordering is assumed to be the same as the training dataset.
- predict_partial_hazard(X) → Series¶
Returns the partial hazard for the individuals, partial since the baseline hazard is not included. Equal to \(\exp((x - \bar{x})'\beta)\)
- Parameters:
X (numpy array or DataFrame) – a (n,d) covariate numpy array or DataFrame. If a DataFrame, columns can be in any order. If a numpy array, columns must be in the same order as the training data.
- Return type:
Series
Note
If X is a DataFrame, the order of the columns does not matter. But if X is an array, then the column ordering is assumed to be the same as the training dataset.
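The two formulas above can be illustrated with plain NumPy. This is a sketch of the arithmetic only, not the library's internal code. The names beta, x_bar and X are hypothetical stand-ins for the fitted coefficients, the training means of the covariates, and new covariate rows.

```python
import numpy as np

# Hypothetical values standing in for a fitted model's state.
beta = np.array([0.5, -0.25])   # coefficients
x_bar = np.array([1.0, 2.0])    # training means of the covariates

# New covariate rows, columns in the same order as the training data.
X = np.array([
    [1.0, 2.0],
    [3.0, 2.0],
    [1.0, 6.0],
])

# Log partial hazard: (x - x_bar)' beta, one value per row.
log_partial_hazard = (X - x_bar) @ beta

# Partial hazard: exp((x - x_bar)' beta).
partial_hazard = np.exp(log_partial_hazard)

print(log_partial_hazard)
print(partial_hazard)
```

A row equal to the training means has a log partial hazard of 0 and therefore a partial hazard of 1.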
- print_summary(decimals=2, style=None, columns=None, **kwargs)¶
Print summary statistics describing the fit, the coefficients, and the error bounds.
- Parameters:
decimals (int, optional (default=2)) – specify the number of decimal places to show
style (string) – {html, ascii, latex}
columns – only display a subset of summary columns. Default all.
kwargs – print additional metadata in the output (useful to provide model names, dataset names, etc.) when comparing multiple outputs.
- property summary¶
Summary statistics describing the fit.
- Returns:
df – Contains columns coef, np.exp(coef), se(coef), z, p, lower, upper
- Return type:
DataFrame