WeibullFitter

class lifelines.fitters.weibull_fitter.WeibullFitter(*args, **kwargs)

Bases: lifelines.fitters.KnownModelParametricUnivariateFitter

This class implements a Weibull model for univariate data. The model has the parameterized form:

\[S(t) = \exp\left(-\left(\frac{t}{\lambda}\right)^\rho\right), \lambda > 0, \rho > 0,\]

The \(\lambda\) (scale) parameter has a direct interpretation: it is the time by which 63.2% of the population has died, since \(S(\lambda) = e^{-1} \approx 0.368\). The \(\rho\) (shape) parameter controls whether the cumulative hazard (see below) is convex (\(\rho > 1\)) or concave (\(\rho < 1\)), representing accelerating or decelerating hazards, respectively.
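
As a quick numerical check of the scale interpretation, the sketch below evaluates the survival function from the formula above at t = λ. The helper name `weibull_survival` and the parameter values are illustrative assumptions, not part of lifelines.

```python
import math

def weibull_survival(t, lambda_, rho_):
    """S(t) = exp(-(t / lambda)^rho), the parameterization used on this page."""
    return math.exp(-((t / lambda_) ** rho_))

# Illustrative (assumed) parameter values.
lambda_, rho_ = 50.0, 1.5

# At t = lambda the exponent is -1 regardless of rho, so S = exp(-1).
fraction_dead = 1.0 - weibull_survival(lambda_, lambda_, rho_)
print(round(fraction_dead, 3))  # 0.632
```

Changing `rho_` leaves the printed value unchanged, which is why λ can be read as a characteristic lifetime.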

[Figure: weibull_parameters.png]

The cumulative hazard rate is

\[H(t) = \left(\frac{t}{\lambda}\right)^\rho,\]

and the hazard rate is:

\[h(t) = \frac{\rho}{\lambda}\left(\frac{t}{\lambda}\right)^{\rho-1}\]
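
The relationship between the two formulas above can be checked numerically. The following is a minimal sketch in plain Python (the function names and parameter values are hypothetical): it confirms that h(t) is the derivative of H(t) and shows how ρ determines whether the hazard falls, stays constant, or rises.

```python
import math

def cumulative_hazard(t, lambda_, rho_):
    """H(t) = (t / lambda)^rho."""
    return (t / lambda_) ** rho_

def hazard(t, lambda_, rho_):
    """h(t) = (rho / lambda) * (t / lambda)^(rho - 1)."""
    return (rho_ / lambda_) * (t / lambda_) ** (rho_ - 1)

lambda_ = 10.0       # illustrative scale
t, eps = 5.0, 1e-6   # evaluation point and finite-difference step
trends = {}

for rho_ in (0.5, 1.0, 2.0):
    # A central finite difference of H should approximate h.
    numeric = (cumulative_hazard(t + eps, lambda_, rho_)
               - cumulative_hazard(t - eps, lambda_, rho_)) / (2 * eps)
    assert math.isclose(numeric, hazard(t, lambda_, rho_), rel_tol=1e-6)

    # Compare the hazard late (t=8) and early (t=2) to classify its trend.
    change = hazard(8.0, lambda_, rho_) - hazard(2.0, lambda_, rho_)
    trends[rho_] = ("decreasing" if change < 0
                    else "constant" if change == 0 else "increasing")

print(trends)
```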

After calling the .fit method, you have access to properties like: cumulative_hazard_, survival_function_, lambda_ and rho_. A summary of the fit is available with the method print_summary().

Parameters: alpha (float, optional (default=0.05)) – the significance level for the confidence intervals; the default of 0.05 gives 95% intervals.

Important

The parameterization of this model changed in lifelines 0.19.0. Previously, the cumulative hazard looked like \((\lambda t)^\rho\). The current \(\lambda\) is the reciprocal of the previous one.
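
If you have parameters from a fit made before 0.19.0, the conversion is a single reciprocal. The sketch below uses made-up values and hypothetical helper names to show that both forms give the same cumulative hazard once λ is inverted.

```python
import math

def cumulative_hazard_old(t, lambda_old, rho_):
    """Pre-0.19.0 form: H(t) = (lambda * t)^rho."""
    return (lambda_old * t) ** rho_

def cumulative_hazard_new(t, lambda_new, rho_):
    """Current form: H(t) = (t / lambda)^rho."""
    return (t / lambda_new) ** rho_

lambda_old, rho_ = 0.02, 1.3   # illustrative values, as if from an old fit
lambda_new = 1.0 / lambda_old  # the reciprocal; rho is unchanged

for t in (1.0, 10.0, 100.0):
    assert math.isclose(cumulative_hazard_old(t, lambda_old, rho_),
                        cumulative_hazard_new(t, lambda_new, rho_))
```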

Examples

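from lifelines import WeibullFitter
from lifelines.datasets import load_waltons

waltons = load_waltons()  # columns: T (duration), E (event observed), group
wbf = WeibullFitter()
wbf.fit(waltons['T'], waltons['E'])  # durations, event_observed
wbf.plot()
print(wbf.lambda_)  # fitted scale parameter
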
cumulative_hazard_

The estimated cumulative hazard (with custom timeline if provided)

Type:DataFrame
hazard_

The estimated hazard (with custom timeline if provided)

Type:DataFrame
survival_function_

The estimated survival function (with custom timeline if provided)

Type:DataFrame
cumulative_density_

The estimated cumulative density function (with custom timeline if provided)

Type:DataFrame
density_

The estimated density function (PDF) (with custom timeline if provided)

Type:DataFrame
variance_matrix_

The variance matrix of the coefficients

Type:numpy array
median_survival_time_

The median time to event

Type:float
lambda_

The fitted scale parameter \(\lambda\) of the model

Type:float
rho_

The fitted shape parameter \(\rho\) of the model

Type:float
durations

The durations provided

Type:array
event_observed

The event_observed variable provided

Type:array
timeline

The time line to use for plotting and indexing

Type:array
entry

The entry array provided, or None

Type:array or None

Notes

Looking for a 3-parameter Weibull model? See notes here.

AIC_
conditional_time_to_event_

Return a DataFrame, with the same index as survival_function_, that estimates the median duration remaining until the death event, given survival up until time t. For example, if an individual survives until age 1, their median remaining lifetime given they lived to time 1 might be 9 years.
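
For a Weibull model, the conditional median has a closed form obtained by solving S(t + s) / S(t) = 0.5 for s. The sketch below illustrates that formula with hypothetical helper names and made-up parameters. It is not a description of how lifelines computes the property, which may work from the fitted curve on the timeline and so give slightly different values.

```python
import math

def survival(t, lambda_, rho_):
    """S(t) = exp(-(t / lambda)^rho)."""
    return math.exp(-((t / lambda_) ** rho_))

def median_remaining(t, lambda_, rho_):
    """Solve S(t + s) / S(t) = 0.5 for s.

    Taking logs gives ((t + s) / lambda)^rho = (t / lambda)^rho + ln 2.
    """
    return lambda_ * ((t / lambda_) ** rho_ + math.log(2)) ** (1 / rho_) - t

lambda_, rho_ = 10.0, 2.0  # illustrative values
remaining = {t: median_remaining(t, lambda_, rho_) for t in (0.0, 1.0, 5.0)}

for t, s in remaining.items():
    # Half of those alive at t are still alive at t + s.
    ratio = survival(t + s, lambda_, rho_) / survival(t, lambda_, rho_)
    assert math.isclose(ratio, 0.5)
```

At t = 0 the result reduces to the ordinary median survival time.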

confidence_interval_

The confidence interval of the cumulative hazard. This is an alias for confidence_interval_cumulative_hazard_.

confidence_interval_cumulative_density_

The lower and upper confidence intervals for the cumulative density

confidence_interval_cumulative_hazard_

The confidence interval of the cumulative hazard. This is an alias for confidence_interval_.

confidence_interval_density_

The confidence interval of the density.

confidence_interval_hazard_

The confidence interval of the hazard.

confidence_interval_survival_function_

The lower and upper confidence intervals for the survival function

cumulative_density_at_times(times, label=None) → pandas.core.series.Series

Return a Pandas series of the predicted cumulative density function (1-survival function) at specific times.

Parameters:
  • times (iterable or float) – values to return the cumulative density function at.
  • label (string, optional) – Rename the series returned. Useful for plotting.
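
To make the "1 - survival function" relationship concrete, here is a minimal plain-Python sketch of the Weibull cumulative density at a list of times. The function name and parameter values are assumptions for illustration; the lifelines method returns a pandas Series rather than a list.

```python
import math

def weibull_cdf_at_times(times, lambda_, rho_):
    """F(t) = 1 - exp(-(t / lambda)^rho), evaluated for each t in times."""
    # -expm1(-x) equals 1 - exp(-x) but keeps precision when x is tiny.
    return [-math.expm1(-((t / lambda_) ** rho_)) for t in times]

lambda_, rho_ = 10.0, 2.0  # illustrative values
cdf = weibull_cdf_at_times([0.0, 5.0, 10.0], lambda_, rho_)
print(cdf)
```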
cumulative_hazard_at_times(times, label=None) → pandas.core.series.Series

Return a Pandas series of the predicted cumulative hazard value at specific times.

Parameters:
  • times (iterable or float) – values to return the cumulative hazard at.
  • label (string, optional) – Rename the series returned. Useful for plotting.
density_at_times(times, label=None) → pandas.core.series.Series

Return a Pandas series of the predicted probability density function, dCDF/dt, at specific times.

Parameters:
  • times (iterable or float) – values to return the probability density function at.
  • label (string, optional) – Rename the series returned. Useful for plotting.
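
Because the density is dCDF/dt, it equals h(t) · S(t) and should integrate to 1 over the positive half-line. The sketch below checks this with a simple midpoint rule. The helper name, parameter values, and integration range are illustrative assumptions.

```python
import math

def weibull_density(t, lambda_, rho_):
    """f(t) = dF/dt = h(t) * S(t) for the Weibull model on this page."""
    hazard = (rho_ / lambda_) * (t / lambda_) ** (rho_ - 1)
    survival = math.exp(-((t / lambda_) ** rho_))
    return hazard * survival

lambda_, rho_ = 10.0, 1.5  # illustrative values

# Midpoint rule on [0, 100]; S(100) is negligible for these parameters.
n, upper = 20_000, 100.0
width = upper / n
total = sum(weibull_density((i + 0.5) * width, lambda_, rho_)
            for i in range(n)) * width
print(abs(total - 1.0) < 1e-4)  # True
```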
divide(other) → pandas.core.frame.DataFrame

Divide the estimates of two fitted objects of the same class.

Parameters:other (same object as self)
event_table
fit(durations, event_observed=None, timeline=None, label=None, alpha=None, ci_labels=None, show_progress=False, entry=None, weights=None, initial_point=None) → self
Parameters:
  • durations (an array, or pd.Series) – length n, duration subject was observed for
  • event_observed (numpy array or pd.Series, optional) – length n, True if the death was observed, False if the event was lost (right-censored). Defaults to all True if event_observed is None.
  • timeline (list, optional) – return the estimate at the values in timeline (positively increasing)
  • label (string, optional) – a string to name the column of the estimate.
  • alpha (float, optional) – the alpha value in the confidence intervals. Overrides the initializing alpha for this call to fit only.
  • ci_labels (list, optional) – add custom column names to the generated confidence intervals as a length-2 list: [<lower-bound name>, <upper-bound name>]. Default: <label>_lower_<alpha>
  • show_progress (bool, optional) – since this is an iterative fitting algorithm, switching this to True will display some iteration details.
  • entry (an array, or pd.Series, of length n) – relative time when a subject entered the study. This is useful for left-truncated (not left-censored) observations. If None, all members of the population entered study when they were “born”: time zero.
  • weights (an array, or pd.Series, of length n) – integer weights per observation
  • initial_point ((d,) numpy array, optional) – initialize the starting point of the iterative algorithm. Default is the zero vector.
Returns:

self with new properties like cumulative_hazard_, survival_function_

Return type:

self
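
To show how durations, event_observed, entry, and weights each enter the fit, the sketch below writes out the standard Weibull log-likelihood for right-censored, left-truncated data. This is a textbook formulation under the parameterization on this page, with a hypothetical function name. It is not lifelines' internal code, which maximizes its likelihood with an iterative numerical optimizer.

```python
import math

def weibull_log_likelihood(lambda_, rho_, durations, event_observed=None,
                           entry=None, weights=None):
    """Weighted log-likelihood with right censoring and left truncation."""
    n = len(durations)
    # Mirror the documented defaults: all events observed, entry at time zero.
    event_observed = [True] * n if event_observed is None else event_observed
    entry = [0.0] * n if entry is None else entry
    weights = [1.0] * n if weights is None else weights

    def cumulative_hazard(t):
        return (t / lambda_) ** rho_

    total = 0.0
    for t, observed, t0, w in zip(durations, event_observed, entry, weights):
        # Log-probability of surviving from entry time t0 until t.
        ll = -cumulative_hazard(t) + cumulative_hazard(t0)
        if observed:
            # Observed deaths also contribute the log-hazard at t.
            ll += math.log(rho_ / lambda_) + (rho_ - 1) * math.log(t / lambda_)
        total += w * ll
    return total

# Toy data (made up): the third subject is right-censored.
durations = [2.0, 4.0, 6.0, 8.0]
events = [True, True, False, True]

# With rho fixed at 1 (exponential), the maximizer is sum(t) / number of events.
lambda_hat = sum(durations) / sum(events)
best = weibull_log_likelihood(lambda_hat, 1.0, durations, events)
assert best > weibull_log_likelihood(5.0, 1.0, durations, events)
assert best > weibull_log_likelihood(9.0, 1.0, durations, events)
```

Censored subjects contribute only the survival term, which is why dropping them instead of flagging them would bias the fitted parameters.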

fit_interval_censoring(lower_bound, upper_bound, event_observed=None, timeline=None, label=None, alpha=None, ci_labels=None, show_progress=False, entry=None, weights=None, initial_point=None) → self

Fit the model to an interval censored dataset.

Parameters:
  • lower_bound (an array, or pd.Series) – length n, the start of the period the subject experienced the event in.
  • upper_bound (an array, or pd.Series) – length n, the end of the period the subject experienced the event in. If the value is equal to the corresponding value in lower_bound, then the individual’s event was observed (not censored).
  • event_observed (numpy array or pd.Series, optional) – length n. If left as None, it is inferred from lower_bound and upper_bound (if lower_bound == upper_bound, the event was observed; if lower_bound < upper_bound, the event was censored).
  • timeline (list, optional) – return the estimate at the values in timeline (positively increasing)
  • label (string, optional) – a string to name the column of the estimate.
  • alpha (float, optional) – the alpha value in the confidence intervals. Overrides the initializing alpha for this call to fit only.
  • ci_labels (list, optional) – add custom column names to the generated confidence intervals as a length-2 list: [<lower-bound name>, <upper-bound name>]. Default: <label>_lower_<alpha>
  • show_progress (bool, optional) – since this is an iterative fitting algorithm, switching this to True will display some iteration details.
  • entry (an array, or pd.Series, of length n) – relative time when a subject entered the study. This is useful for left-truncated (not left-censored) observations. If None, all members of the population entered study when they were “born”: time zero.
  • weights (an array, or pd.Series, of length n) – integer weights per observation
  • initial_point ((d,) numpy array, optional) – initialize the starting point of the iterative algorithm. Default is the zero vector.
Returns:

self with new properties like cumulative_hazard_, survival_function_

Return type:

self
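
The inference rule for event_observed and the meaning of an interval can be sketched in plain Python. The data, names, and parameter values below are made up for illustration, and the probability shown is the standard Weibull contribution of a censored interval rather than a quote from the lifelines source.

```python
import math

# Toy intervals (made up): the second subject's event time is known exactly.
lower_bound = [2.0, 5.0, 3.0]
upper_bound = [4.0, 5.0, 9.0]

# Equal bounds mean the event was observed; otherwise it is interval-censored.
event_observed = [lo == up for lo, up in zip(lower_bound, upper_bound)]
print(event_observed)  # [False, True, False]

def survival(t, lambda_, rho_):
    """S(t) = exp(-(t / lambda)^rho)."""
    return math.exp(-((t / lambda_) ** rho_))

def interval_probability(lo, up, lambda_, rho_):
    """P(lo < T <= up) = S(lo) - S(up) for a censored interval."""
    return survival(lo, lambda_, rho_) - survival(up, lambda_, rho_)

lambda_, rho_ = 10.0, 2.0  # illustrative parameter values
p = interval_probability(lower_bound[0], upper_bound[0], lambda_, rho_)
```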

fit_left_censoring(durations, event_observed=None, timeline=None, label=None, alpha=None, ci_labels=None, show_progress=False, entry=None, weights=None, initial_point=None) → self

Fit the model to a left-censored dataset

Parameters:
  • durations (an array, or pd.Series) – length n, duration subject was observed for
  • event_observed (numpy array or pd.Series, optional) – length n, True if the death was observed, False if the event was lost (left-censored). Defaults to all True if event_observed is None.
  • timeline (list, optional) – return the estimate at the values in timeline (positively increasing)
  • label (string, optional) – a string to name the column of the estimate.
  • alpha (float, optional) – the alpha value in the confidence intervals. Overrides the initializing alpha for this call to fit only.
  • ci_labels (list, optional) – add custom column names to the generated confidence intervals as a length-2 list: [<lower-bound name>, <upper-bound name>]. Default: <label>_lower_<alpha>
  • show_progress (bool, optional) – since this is an iterative fitting algorithm, switching this to True will display some iteration details.
  • entry (an array, or pd.Series, of length n) – relative time when a subject entered the study. This is useful for left-truncated (not left-censored) observations. If None, all members of the population entered study when they were “born”: time zero.
  • weights (an array, or pd.Series, of length n) – integer weights per observation
  • initial_point ((d,) numpy array, optional) – initialize the starting point of the iterative algorithm. Default is the zero vector.
Returns:

self with new properties like cumulative_hazard_, survival_function_

Return type:

self

fit_right_censoring(*args, **kwargs)

Alias for fit

See also

fit

hazard_at_times(times, label=None) → pandas.core.series.Series

Return a Pandas series of the predicted hazard at specific times.

Parameters:
  • times (iterable or float) – values to return the hazard at.
  • label (string, optional) – Rename the series returned. Useful for plotting.
median_survival_time_

Return the unique time point, t, such that S(t) = 0.5. This is the “half-life” of the population, and a robust summary statistic for the population, if it exists.

percentile(p) → float

Return the unique time point, t, such that S(t) = p.

Parameters:p (float)
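
For the Weibull model, the equation S(t) = p can be inverted by hand, giving t = λ(-ln p)^(1/ρ); the median is the special case p = 0.5. The following sketch uses a hypothetical helper name and made-up parameters to illustrate that inversion.

```python
import math

def weibull_percentile(p, lambda_, rho_):
    """Time t with S(t) = p, from inverting exp(-(t / lambda)^rho) = p."""
    return lambda_ * (-math.log(p)) ** (1 / rho_)

lambda_, rho_ = 10.0, 2.0  # illustrative values

median = weibull_percentile(0.5, lambda_, rho_)
# Plugging the result back into S(t) recovers p.
assert math.isclose(math.exp(-((median / lambda_) ** rho_)), 0.5)

# p = exp(-1) recovers the scale parameter itself.
assert math.isclose(weibull_percentile(math.exp(-1), lambda_, rho_), lambda_)
```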
plot(**kwargs)

Produce a pretty-plot of the estimate.

plot_cumulative_density(**kwargs)
plot_cumulative_hazard(**kwargs)
plot_density(**kwargs)
plot_hazard(**kwargs)
plot_survival_function(**kwargs)
predict(times: Union[Iterable[float], float], interpolate=False) → pandas.core.series.Series

Predict the estimate at certain points in time. Uses a linear interpolation if points in time are not in the index.

Parameters:
  • times (scalar, or array) – a scalar or an array of times to predict the value of the estimate at.
  • interpolate (bool, optional (default=False)) – for methods that produce a stepwise solution (Kaplan-Meier, Nelson-Aalen, etc), turning this to True will use a linear interpolation method to provide a more “smooth” answer.
print_summary(decimals=2, style=None, **kwargs)

Print summary statistics describing the fit, the coefficients, and the error bounds.

Parameters:
  • decimals (int, optional (default=2)) – specify the number of decimal places to show
  • style (string) – {html, ascii, latex}
  • kwargs – print additional metadata in the output (useful to provide model names, dataset names, etc.) when comparing multiple outputs.
subtract(other) → pandas.core.frame.DataFrame

Subtract the estimates of two fitted objects of the same class.

Parameters:other (same object as self)
summary

Summary statistics describing the fit.

See also

print_summary

survival_function_at_times(times, label=None) → pandas.core.series.Series

Return a Pandas series of the predicted survival value at specific times.

Parameters:
  • times (iterable or float) – values to return the survival function at.
  • label (string, optional) – Rename the series returned. Useful for plotting.