LogNormalAFTFitter

class lifelines.fitters.log_normal_aft_fitter.LogNormalAFTFitter(alpha=0.05, penalizer=0.0, l1_ratio=0.0, fit_intercept=True, model_ancillary=False)

This class implements a Log-Normal AFT model. The model has the parameterized form \(\mu(x) = a_0 + a_1x_1 + ... + a_n x_n\) and, optionally, \(\sigma(y) = \exp\left(b_0 + b_1 y_1 + ... + b_m y_m \right)\).

The cumulative hazard rate is

\[H(t; x, y) = -\log\left(1 - \Phi\left(\frac{\log(t) - \mu(x)}{\sigma(y)}\right)\right)\]
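
As a rough illustration of the formula above, the following sketch evaluates the cumulative hazard for a single subject using only the Python standard library. The function name and the coefficient values are hypothetical, chosen for illustration; this is not how lifelines computes it internally.

```python
import math
from statistics import NormalDist


def log_normal_cumulative_hazard(t, mu, sigma):
    """H(t) = -log(1 - Phi((log t - mu) / sigma)), for t > 0 and sigma > 0."""
    z = (math.log(t) - mu) / sigma
    survival = 1.0 - NormalDist().cdf(z)  # S(t) = 1 - Phi(z)
    return -math.log(survival)


# Hypothetical coefficients for one subject with a single covariate x_1 = 2.0.
mu = 1.0 + 0.5 * 2.0    # a_0 + a_1 * x_1
sigma = math.exp(-0.2)  # exp(b_0); no ancillary covariates

# At t = exp(mu) the standardized value is 0, so S(t) = 0.5 and H(t) is about log(2).
H_at_median = log_normal_cumulative_hazard(math.exp(mu), mu, sigma)
```

Because \(S(t) = \exp(-H(t))\), the same helper also gives the survival function if needed.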

After calling the .fit method, you have access to properties like params_ and confidence_intervals_. A summary of the fit is available with the method print_summary().

Parameters:
  • alpha (float, optional (default=0.05)) – the level in the confidence intervals.

  • fit_intercept (bool, optional (default=True)) – Allow lifelines to add an intercept column of 1s to df, and to ancillary if applicable.

  • penalizer (float or array, optional (default=0.0)) – the penalizer coefficient applied to the size of the coefficients. See l1_ratio. Must be greater than or equal to 0. Alternatively, penalizer can be an array equal in size to the number of parameters, with penalty coefficients for specific variables. For example, penalizer=0.01 * np.ones(p) is the same as penalizer=0.01.

  • l1_ratio (float, optional (default=0.0)) – how much of the penalizer should be attributed to an l1 penalty (otherwise an l2 penalty). The penalty function looks like penalizer * l1_ratio * ||w||_1 + 0.5 * penalizer * (1 - l1_ratio) * ||w||^2_2

  • model_ancillary (bool, optional (default=False)) – set the model instance to always model the ancillary parameter with the supplied DataFrame. This is useful for grid-search optimization.
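
To make the penalty expression in l1_ratio concrete, here is a minimal sketch that evaluates it for a scalar penalizer. The function name and the coefficient vector are hypothetical, and the array form of penalizer is not covered; it only mirrors the formula quoted above and is not the library's code.

```python
def elastic_net_penalty(w, penalizer, l1_ratio):
    """penalizer * l1_ratio * ||w||_1 + 0.5 * penalizer * (1 - l1_ratio) * ||w||_2^2"""
    l1_norm = sum(abs(v) for v in w)
    l2_norm_squared = sum(v * v for v in w)
    return (
        penalizer * l1_ratio * l1_norm
        + 0.5 * penalizer * (1.0 - l1_ratio) * l2_norm_squared
    )


# Hypothetical coefficient vector.
w = [0.5, -1.0, 2.0]

pure_l2 = elastic_net_penalty(w, penalizer=0.1, l1_ratio=0.0)  # only the l2 term remains
pure_l1 = elastic_net_penalty(w, penalizer=0.1, l1_ratio=1.0)  # only the l1 term remains
mixed = elastic_net_penalty(w, penalizer=0.1, l1_ratio=0.5)    # equal weight on both terms
```

With l1_ratio=0.0 (the default) the penalty is purely quadratic, and with penalizer=0.0 it vanishes regardless of l1_ratio.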

params_

The estimated coefficients

Type:

DataFrame

confidence_intervals_

The lower and upper confidence intervals for the coefficients

Type:

DataFrame

durations

The durations provided

Type:

Series

event_observed

The event_observed variable provided

Type:

Series

weights

The weights provided

Type:

Series

variance_matrix_

The variance matrix of the coefficients

Type:

DataFrame

standard_errors_

The standard errors of the estimates

Type:

Series

score_

The concordance index of the model

Type:

float

predict_expectation(df: DataFrame, ancillary: DataFrame | None = None) → Series

Predict the expectation of lifetimes, \(E[T | x]\).

Parameters:
  • df (numpy array or DataFrame) – a (n,d) covariate numpy array or DataFrame. If a DataFrame, columns can be in any order. If a numpy array, columns must be in the same order as the training data.

  • ancillary (numpy array or DataFrame, optional) – a (n,d) covariate numpy array or DataFrame. If a DataFrame, columns can be in any order. If a numpy array, columns must be in the same order as the training data.

Returns:

expectations – the expected lifetimes for the individuals.

Return type:

Series

See also

predict_median
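
For intuition about what this method estimates: if \(\log T\) is normal with mean \(\mu\) and standard deviation \(\sigma\), then \(E[T] = \exp(\mu + \sigma^2 / 2)\), while the median is \(\exp(\mu)\). The sketch below evaluates both for hypothetical parameter values; the helper names are illustrative and are not part of lifelines.

```python
import math


def log_normal_expectation(mu, sigma):
    """Mean of a log-normal lifetime: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2.0)


def log_normal_median(mu):
    """Median of a log-normal lifetime: exp(mu)."""
    return math.exp(mu)


# Hypothetical values of mu(x) and sigma(y) for one subject.
mu, sigma = 2.0, 0.8

expected_lifetime = log_normal_expectation(mu, sigma)
median_lifetime = log_normal_median(mu)
```

The expectation always exceeds the median when \(\sigma > 0\), since the log-normal distribution is right-skewed.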

predict_percentile(df: DataFrame, *, ancillary: DataFrame | None = None, p: float = 0.5, conditional_after: ndarray | None = None) → Series

Returns the p-th percentile of the lifetimes for the individuals; by default, this is the median (p = 0.5). If the survival curve of an individual does not cross p, then the result is infinity. See http://stats.stackexchange.com/questions/102986/percentile-loss-functions

Parameters:
  • df (numpy array or DataFrame) – a (n,d) covariate numpy array or DataFrame. If a DataFrame, columns can be in any order. If a numpy array, columns must be in the same order as the training data.

  • ancillary (numpy array or DataFrame, optional) – a (n,d) covariate numpy array or DataFrame. If a DataFrame, columns can be in any order. If a numpy array, columns must be in the same order as the training data.

  • p (float, optional (default=0.5)) – the percentile, must be between 0 and 1.

  • conditional_after (iterable, optional) – Must be equal in size to df.shape[0] (denoted n above). An iterable (array, list, series) of possibly non-zero values that represent how long each subject has already lived. For example, if \(T\) is the unknown event time and \(s\) is the time already survived, then predictions are made for \(T | T > s\). This is useful for knowing the remaining hazard/survival of censored subjects. The new timeline is the remaining duration of the subject, i.e. normalized back to starting at 0.

Returns:

percentiles

Return type:

Series

See also

predict_median
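
The role of conditional_after can be sketched with the standard library. The sketch assumes the percentile is the remaining time \(t\) at which the conditional survival \(S(s + t) / S(s)\) equals p, where \(s\) is the time already survived; this follows the "survival curve crosses p" wording above and coincides with the median at p = 0.5. The function names and parameter values are hypothetical, and the library's exact convention for p other than 0.5 should be checked against its source.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def survival(t, mu, sigma):
    """S(t) = 1 - Phi((log t - mu) / sigma), for t > 0."""
    return 1.0 - _STD_NORMAL.cdf((math.log(t) - mu) / sigma)


def remaining_time_at_survival(p, mu, sigma, conditional_after=0.0):
    """Remaining time t such that S(s + t) / S(s) = p, with s = conditional_after."""
    s = conditional_after
    survived_so_far = survival(s, mu, sigma) if s > 0 else 1.0
    # S(s + t) = p * S(s)  is equivalent to  F(s + t) = 1 - p * S(s).
    target_cdf = 1.0 - p * survived_so_far
    total_time = math.exp(mu + sigma * _STD_NORMAL.inv_cdf(target_cdf))
    return total_time - s  # timeline normalized back to start at 0


# Hypothetical values of mu(x) and sigma(y) for one subject.
mu, sigma = 1.0, 0.5

median = remaining_time_at_survival(0.5, mu, sigma)
median_remaining = remaining_time_at_survival(0.5, mu, sigma, conditional_after=2.0)
```

Without conditioning, the median reduces to \(\exp(\mu)\); with conditional_after=2.0, the result is the additional time until half of the subjects who reached time 2.0 have experienced the event.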