statistics¶

class
lifelines.statistics.
StatisticalResult
(p_value, test_statistic, name=None, **kwargs)¶ Bases:
object
This class holds the result of statistical tests with a nice printer wrapper to display the results.
Note
This class’ API changed in version 0.16.0.
Parameters:  p_value (iterable or float) – the pvalues of a statistical test(s)
 test_statistic (iterable or float) – the test statistics of a statistical test(s). Must be the same size as pvalues if iterable.
 name (iterable or string) – if this class holds multiple results (ex: from a pairwise comparison), this can hold the names. Must be the same size as pvalues if iterable.
 kwargs – additional information to attach to the object and display in
print_summary()
.

print_summary
(decimals=2, **kwargs)¶ Print summary statistics describing the fit, the coefficients, and the error bounds.
Parameters:  decimals (int, optional (default=2)) – specify the number of decimal places to show
 kwargs – print additional meta data in the output (useful to provide model names, dataset names, etc.) when comparing multiple outputs.

summary
¶ returns: a DataFrame containing the test statistics and the pvalue :rtype: DataFrame

lifelines.statistics.
logrank_test
(durations_A, durations_B, event_observed_A=None, event_observed_B=None, t_0=1, **kwargs)¶ Measures and reports on whether two intensity processes are different. That is, given two event series, determines whether the data generating processes are statistically different. The teststatistic is chisquared under the null hypothesis. Let \(h_i(t)\) be the hazard ratio of group \(i\) at time \(t\), then:
\[\begin{split}\begin{align} & H_0: h_1(t) = h_2(t) \\ & H_A: h_1(t) = c h_2(t), \;\; c \ne 1 \end{align}\end{split}\]This implicitly uses the logrank weights.
Note
The logrank test has maximum power when the assumption of proportional hazards is true. As a consequence, if the survival curves cross, the logrank test will give an inaccurate assessment of differences.
Parameters:  durations_A (iterable) – a (n,) listlike of event durations (birth to death,…) for the first population.
 durations_B (iterable) – a (n,) listlike of event durations (birth to death,…) for the second population.
 event_observed_A (iterable, optional) – a (n,) listlike of censorship flags, (1 if observed, 0 if not), for the first population. Default assumes all observed.
 event_observed_B (iterable, optional) – a (n,) listlike of censorship flags, (1 if observed, 0 if not), for the second population. Default assumes all observed.
 t_0 (float, optional (default=1)) – the final time period under observation, 1 for all time.
 kwargs – add keywords and metadata to the experiment summary
Returns: a StatisticalResult object with properties
p_value
,summary
,test_statistic
,print_summary
Return type: Examples
>>> T1 = [1, 4, 10, 12, 12, 3, 5.4] >>> E1 = [1, 0, 1, 0, 1, 1, 1] >>> >>> T2 = [4, 5, 7, 11, 14, 20, 8, 8] >>> E2 = [1, 1, 1, 1, 1, 1, 1, 1] >>> >>> from lifelines.statistics import logrank_test >>> results = logrank_test(T1, T2, event_observed_A=E1, event_observed_B=E2) >>> >>> results.print_summary() >>> print(results.p_value) # 0.7676 >>> print(results.test_statistic) # 0.0872
Notes
This is a special case of the function
multivariate_logrank_test
, which is used internally. See Survival and Event Analysis, page 108.

lifelines.statistics.
multivariate_logrank_test
(event_durations, groups, event_observed=None, t_0=1, **kwargs)¶ This test is a generalization of the logrank_test: it can deal with n>2 populations (and should be equal when n=2):
\[\begin{split}\begin{align} & H_0: h_1(t) = h_2(t) = h_3(t) = ... = h_n(t) \\ & H_A: \text{there exist at least one group that differs from the other.} \end{align}\end{split}\]Parameters:  event_durations (iterable) – a (n,) listlike representing the (possibly partial) durations of all individuals
 groups (iterable) – a (n,) listlike of unique group labels for each individual.
 event_observed (iterable, optional) – a (n,) listlike of event_observed events: 1 if observed death, 0 if censored. Defaults to all observed.
 t_0 (float, optional (default=1)) – the period under observation, 1 for all time.
 kwargs – add keywords and metadata to the experiment summary.
Returns: a StatisticalResult object with properties
p_value
,summary
,test_statistic
,print_summary
Return type: Examples
>>> df = pd.DataFrame({ >>> 'durations': [5, 3, 9, 8, 7, 4, 4, 3, 2, 5, 6, 7], >>> 'events': [1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0], >>> 'groups': [0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2] >>> }) >>> result = multivariate_logrank_test(df['durations'], df['groups'], df['events']) >>> result.test_statistic >>> result.p_value >>> result.print_summary()
>>> # numpy example >>> G = [0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2] >>> T = [5, 3, 9, 8, 7, 4, 4, 3, 2, 5, 6, 7] >>> E = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0] >>> result = multivariate_logrank_test(T, G, E) >>> result.test_statistic
See also

lifelines.statistics.
pairwise_logrank_test
(event_durations, groups, event_observed=None, t_0=1, **kwargs)¶ Perform the logrank test pairwise for all \(n \ge 2\) unique groups.
Parameters:  event_durations (iterable) – a (n,) listlike representing the (possibly partial) durations of all individuals
 groups (iterable) – a (n,) listlike of unique group labels for each individual.
 event_observed (iterable, optional) – a (n,) listlike of event_observed events: 1 if observed death, 0 if censored. Defaults to all observed.
 t_0 (float, optional (default=1)) – the period under observation, 1 for all time.
 kwargs – add keywords and metadata to the experiment summary.
Returns: a StatisticalResult object that contains all the pairwise comparisons (try
StatisticalResult.summary
orStatisticalResult.print_summarty
)Return type: See also

lifelines.statistics.
survival_difference_at_fixed_point_in_time_test
(point_in_time, durations_A, durations_B, event_observed_A=None, event_observed_B=None, **kwargs)¶ Often analysts want to compare the survivalness of groups at specific times, rather than comparing the entire survival curves against each other. For example, analysts may be interested in 5year survival. Statistically comparing the naive KaplanMeier points at a specific time actually has reduced power (see [1]). By transforming the KaplanMeier curve, we can recover more power. This function uses the log(log) transformation.
Parameters:  point_in_time (float,) – the point in time to analyze the survival curves at.
 durations_A (iterable) – a (n,) listlike of event durations (birth to death,…) for the first population.
 durations_B (iterable) – a (n,) listlike of event durations (birth to death,…) for the second population.
 event_observed_A (iterable, optional) – a (n,) listlike of censorship flags, (1 if observed, 0 if not), for the first population. Default assumes all observed.
 event_observed_B (iterable, optional) – a (n,) listlike of censorship flags, (1 if observed, 0 if not), for the second population. Default assumes all observed.
 kwargs – add keywords and metadata to the experiment summary
Returns: a StatisticalResult object with properties
p_value
,summary
,test_statistic
,print_summary
Return type: Examples
>>> T1 = [1, 4, 10, 12, 12, 3, 5.4] >>> E1 = [1, 0, 1, 0, 1, 1, 1] >>> >>> T2 = [4, 5, 7, 11, 14, 20, 8, 8] >>> E2 = [1, 1, 1, 1, 1, 1, 1, 1] >>> >>> from lifelines.statistics import survival_difference_at_fixed_point_in_time_test >>> results = survival_difference_at_fixed_point_in_time_test(12, T1, T2, event_observed_A=E1, event_observed_B=E2) >>> >>> results.print_summary() >>> print(results.p_value) # 0.893 >>> print(results.test_statistic) # 0.017
Notes
Other transformations are possible, but Klein et al. [1] showed that the log(log(c)) transform has the most desirable statistical properties.
References
[1] Klein, J. P., Logan, B. , Harhoff, M. and Andersen, P. K. (2007), Analyzing survival curves at a fixed point in time. Statist. Med., 26: 45054519. doi:10.1002/sim.2864

lifelines.statistics.
proportional_hazard_test
(fitted_cox_model, training_df, time_transform='rank', precomputed_residuals=None, **kwargs)¶ Test whether any variable in a Cox model breaks the proportional hazard assumption.
Parameters:  fitted_cox_model (CoxPHFitter) – the fitted Cox model, fitted with training_df, you wish to test. Currently only the CoxPHFitter is supported, but later CoxTimeVaryingFitter, too.
 training_df (DataFrame) – the DataFrame used in the call to the Cox model’s
fit
.  time_transform (vectorized function, list, or string, optional (default=’rank’)) – {‘all’, ‘km’, ‘rank’, ‘identity’, ‘log’} One of the strings above, a list of strings, or a function to transform the time (must accept (time, durations, weights) however). ‘all’ will present all the transforms.
 precomputed_residuals (DataFrame, optional) – specify the residuals, if already computed.
 kwargs – additional parameters to add to the StatisticalResult
Returns: Return type: Notes
R uses the default km, we use rank, as this performs well versus other transforms. See http://eprints.lse.ac.uk/84988/1/06_ParkHendry2015ReassessingSchoenfeldTests_Final.pdf

lifelines.statistics.
power_under_cph
(n_exp, n_con, p_exp, p_con, postulated_hazard_ratio, alpha=0.05)¶ This computes the power of the hypothesis test that the two groups, experiment and control, have different hazards (that is, the relative hazard ratio is different from 1.)
Parameters:  n_exp (integer) – size of the experiment group.
 n_con (integer) – size of the control group.
 p_exp (float) – probability of failure in experimental group over period of study.
 p_con (float) – probability of failure in control group over period of study
 postulated_hazard_ratio (float)
 the postulated hazard ratio
 alpha (float, optional (default=0.05)) – type I error rate
Returns: power to detect the magnitude of the hazard ratio as small as that specified by postulated_hazard_ratio.
Return type: float
Notes
See also

lifelines.statistics.
sample_size_necessary_under_cph
(power, ratio_of_participants, p_exp, p_con, postulated_hazard_ratio, alpha=0.05)¶ This computes the sample size for needed power to compare two groups under a Cox Proportional Hazard model.
Parameters:  power (float) – power to detect the magnitude of the hazard ratio as small as that specified by postulated_hazard_ratio.
 ratio_of_participants (ratio of participants in experimental group over control group.)
 p_exp (float) – probability of failure in experimental group over period of study.
 p_con (float) – probability of failure in control group over period of study
 postulated_hazard_ratio (float) – the postulated hazard ratio
 alpha (float, optional (default=0.05)) – type I error rate
Returns:  n_exp (integer) – the samples sizes need for the experiment to achieve desired power
 n_con (integer) – the samples sizes need for the control group to achieve desired power
Examples
>>> from lifelines.statistics import sample_size_necessary_under_cph >>> >>> desired_power = 0.8 >>> ratio_of_participants = 1. >>> p_exp = 0.25 >>> p_con = 0.35 >>> postulated_hazard_ratio = 0.7 >>> n_exp, n_con = sample_size_necessary_under_cph(desired_power, ratio_of_participants, p_exp, p_con, postulated_hazard_ratio) >>> # (421, 421)
References
https://cran.rproject.org/web/packages/powerSurvEpi/powerSurvEpi.pdf
See also