An Introduction to Survival Analysis Using Stata, Third Edition
by Mario Cleves, William Gould, Roberto G. Gutierrez, and Yulia V. Marchenko
An Introduction to Survival Analysis Using Stata, Third Edition is the
ideal tutorial for professional data analysts who want to learn
survival analysis for the first time or who are well versed in survival
analysis but are not as dexterous in using Stata to analyze survival
data. This text also serves as a valuable reference for those readers
who already have experience using Stata’s survival analysis routines.
The third edition has been updated for Stata 11, and it includes a new
chapter on competing-risks analysis. This chapter describes the
problems posed by competing events (events that impede the failure
event of interest), and covers estimation of cause-specific hazards and
cumulative incidence functions. Other enhancements include the handling
of missing values by multiple imputation in Cox regression, a
new-to-Stata-11 system for specifying categorical (factor) variables
and their interactions, three additional diagnostic measures for Cox
regression, and a more efficient syntax for obtaining predictions and
diagnostics after Cox regression.
Survival analysis is a field of its own that requires specialized data
management and analysis procedures. To meet this requirement, Stata
provides the st family of commands for organizing and summarizing
survival data. The authors of this text are also the authors of Stata’s
st commands.
This book provides statistical theory, step-by-step procedures for
analyzing survival data, an in-depth usage guide for Stata’s most
widely used st commands, and a collection of tips for using Stata to
analyze survival data and to present the results. This book develops
from first principles the statistical concepts unique to survival data
and assumes only a knowledge of basic probability and statistics and a
working knowledge of Stata.
The first three chapters of the text cover basic theoretical concepts:
hazard functions, cumulative hazard functions, and their
interpretations; survivor functions; hazard models; and a comparison of
nonparametric, semiparametric, and parametric methodologies. Chapter 4
deals with censoring and truncation. The next three chapters cover the
formatting, manipulation, stsetting, and error checking involved in
preparing survival data for analysis using Stata’s st analysis
commands. Chapter 8 covers nonparametric methods, including the
Kaplan–Meier and Nelson–Aalen estimators and the various nonparametric
tests for the equality of survival experience.
Chapters 9–11 discuss Cox regression and include various examples of
fitting a Cox model, obtaining predictions, interpreting results,
building models, model diagnostics, and regression with survey data.
The next four chapters cover parametric models, which are fit using
Stata’s streg command. These chapters include detailed derivations of
all six parametric models currently supported in Stata and methods for
determining which model is appropriate, as well as information on
stratification, obtaining predictions, and advanced topics such as
frailty models. Chapter 16 is devoted to power and sample-size
calculations for survival studies. The final chapter covers survival
analysis in the presence of competing risks.
Table of contents
List of Tables
List of Figures
Preface to the Third Edition
Preface to the Second Edition
Preface to the Revised Edition
Preface to the First Edition
Notation and Typography
1 The problem of survival analysis
- 1.1 Parametric modeling
- 1.2 Semiparametric modeling
- 1.3 Nonparametric analysis
- 1.4 Linking the three approaches
2 Describing the distribution of failure times
- 2.1 The survivor and hazard functions
- 2.2 The quantile function
- 2.3 Interpreting the cumulative hazard and hazard rate
- 2.3.1 Interpreting the cumulative hazard
- 2.3.2 Interpreting the hazard rate
- 2.4 Means and medians
3 Hazard models
- 3.1 Parametric models
- 3.2 Semiparametric models
- 3.3 Analysis time (time at risk)
4 Censoring and truncation
- 4.1 Censoring
- 4.1.1 Right-censoring
- 4.1.2 Interval-censoring
- 4.1.3 Left-censoring
- 4.2 Truncation
- 4.2.1 Left-truncation (delayed entry)
- 4.2.2 Interval-truncation (gaps)
- 4.2.3 Right-truncation
5 Recording survival data
- 5.1 The desired format
- 5.2 Other formats
- 5.3 Example: Wide-form snapshot data
6 Using stset
- 6.1 A short lesson on dates
- 6.2 The purpose of the stset command
- 6.3 The syntax of the stset command
- 6.3.1 Specifying analysis time
- 6.3.2 Variables defined by stset
- 6.3.3 Specifying what constitutes failure
- 6.3.4 Specifying when subjects exit from the analysis
- 6.3.5 Specifying when subjects enter the analysis
- 6.3.6 Specifying the subject-ID variable
- 6.3.7 Specifying the begin-of-span variable
- 6.3.8 Convenience options
7 After stset
- 7.1 Look at stset's output
- 7.2 List some of your data
- 7.3 Use stdescribe
- 7.4 Use stvary
- 7.5 Perhaps use stfill
- 7.6 Example: Hip fracture data
8 Nonparametric analysis
- 8.1 Inadequacies of standard univariate methods
- 8.2 The Kaplan–Meier estimator
- 8.2.1 Calculation
- 8.2.2 Censoring
- 8.2.3 Left-truncation (delayed entry)
- 8.2.4 Interval-truncation (gaps)
- 8.2.5 Relationship to the empirical distribution function
- 8.2.6 Other uses of sts list
- 8.2.7 Graphing the Kaplan–Meier estimate
- 8.3 The Nelson–Aalen estimator
- 8.4 Estimating the hazard function
- 8.5 Estimating mean and median survival times
- 8.6 Tests of hypothesis
- 8.6.1 The log-rank test
- 8.6.2 The Wilcoxon test
- 8.6.3 Other tests
- 8.6.4 Stratified tests
9 The Cox proportional hazards model
- 9.1 Using stcox
- 9.1.1 The Cox model has no intercept
- 9.1.2 Interpreting coefficients
- 9.1.3 The effect of units on coefficients
- 9.1.4 Estimating the baseline cumulative hazard and survivor functions
- 9.1.5 Estimating the baseline hazard function
- 9.1.6 The effect of units on the baseline functions
- 9.2 Likelihood calculations
- 9.2.1 No tied failures
- 9.2.2 Tied failures
- The marginal calculation
- The partial calculation
- The Breslow approximation
- The Efron approximation
- 9.2.3 Summary
- 9.3 Stratified analysis
- 9.3.1 Obtaining coefficient estimates
- 9.3.2 Obtaining estimates of baseline functions
- 9.4 Cox models with shared frailty
- 9.4.1 Parameter estimation
- 9.4.2 Obtaining estimates of baseline functions
- 9.5 Cox models with survey data
- 9.5.1 Declaring survey characteristics
- 9.5.2 Fitting a Cox model with survey data
- 9.5.3 Some caveats of analyzing survival data from complex survey designs
- 9.6 Cox model with missing data–multiple imputation
- 9.6.1 Imputing missing values
- 9.6.2 Multiple-imputation inference
10 Model building using stcox
- 10.1 Indicator variables
- 10.2 Categorical variables
- 10.3 Continuous variables
- 10.3.1 Fractional polynomials
- 10.4 Interactions
- 10.5 Time-varying variables
- 10.5.1 Using stcox, tvc() texp()
- 10.5.2 Using stsplit
- 10.6 Modeling group effects: fixed-effects, random-effects, stratification, and clustering
11 The Cox model: Diagnostics
- 11.1 Testing the proportional hazards assumption
- 11.1.1 Tests based on re-estimation
- 11.1.2 Test based on Schoenfeld residuals
- 11.1.3 Graphical methods
- 11.2 Residuals and diagnostic measures
- Reye's syndrome data
- 11.2.1 Determining functional form
- 11.2.2 Goodness of fit
- 11.2.3 Outliers and influential points
12 Parametric models
- 12.1 Motivation
- 12.2 Classes of parametric models
- 12.2.1 Parametric proportional hazards models
- 12.2.2 Accelerated failure-time models
- 12.2.3 Comparing the two parameterizations
13 A survey of parametric regression models in Stata
- 13.1 The exponential model
- 13.1.1 Exponential regression in the PH metric
- 13.1.2 Exponential regression in the AFT metric
- 13.2 Weibull regression
- 13.2.1 Weibull regression in the PH metric
- Fitting null models
- 13.2.2 Weibull regression in the AFT metric
- 13.3 Gompertz regression (PH metric)
- 13.4 Lognormal regression (AFT metric)
- 13.5 Loglogistic regression (AFT metric)
- 13.6 Generalized gamma regression (AFT metric)
- 13.7 Choosing among parametric models
- 13.7.1 Nested models
- 13.7.2 Nonnested models
14 Postestimation commands for parametric models
- 14.1 Use of predict after streg
- 14.1.1 Predicting the time of failure
- 14.1.2 Predicting the hazard and related functions
- 14.1.3 Calculating residuals
- 14.2 Using stcurve
15 Generalizing the parametric regression model
- 15.1 Using the ancillary() option
- 15.2 Stratified models
- 15.3 Frailty models
- 15.3.1 Unshared frailty models
- 15.3.2 Example: Kidney data
- 15.3.3 Testing for heterogeneity
- 15.3.4 Shared frailty models
16 Power and sample-size determination for survival analysis
- 16.1 Estimating sample size
- 16.1.1 Multiple-myeloma data
- 16.1.2 Comparing two survivor functions nonparametrically
- 16.1.3 Comparing two exponential survivor functions
- 16.1.4 Cox regression models
- 16.2 Accounting for withdrawal and accrual of subjects
- 16.2.1 The effect of withdrawal or loss to follow-up
- 16.2.2 The effect of accrual
- 16.2.3 Examples
- 16.3 Estimating power and effect size
- 16.4 Tabulating or graphing results
17 Competing risks
- 17.1 Cause-specific hazards
- 17.2 Cumulative incidence functions
- 17.3 Nonparametric analysis
- 17.3.1 Breast cancer data
- 17.3.2 Cause-specific hazards
- 17.3.3 Cumulative incidence functions
- 17.4 Semiparametric analysis
- 17.4.1 Cause-specific hazards
- Simultaneous regressions for cause-specific hazards
- 17.4.2 Cumulative incidence functions
- Using stcrreg
- Using stcox
- 17.5 Parametric analysis
References
Author Index
Subject Index

