Michael Mitchell’s Interpreting and Visualizing Regression Models Using Stata, Second Edition is a clear treatment of how to carefully present results from model-fitting in a wide variety of settings. It is a boon to anyone who has to present the tangible meaning of a complex model clearly, regardless of the audience. As an example, many experienced researchers start to squirm when asked to give a simple explanation of the practical meaning of interactions in nonlinear models such as logistic regression. The techniques presented in Mitchell’s book make answering those questions easy. The overarching theme of the book is that graphs make interpreting even the most complicated models containing interaction terms, categorical variables, and other intricacies straightforward.
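To see why such interactions are awkward to explain, consider a minimal sketch in Python (the book itself works in Stata). The coefficients below are hypothetical and chosen only for illustration; they do not come from the book. The point is that a single interaction coefficient on the logit scale translates into a group difference in predicted probability that varies with the level of the continuous predictor, which is why tabulated or graphed predictions are easier to explain than raw coefficients.

```python
import math

# Hypothetical coefficients (illustrative only, not from the book) for
#   logit(p) = B0 + B1*x + B2*g + B3*x*g
# where x is a continuous predictor and g is a 0/1 group indicator.
B0, B1, B2, B3 = -2.0, 0.5, 0.3, 0.4


def predicted_probability(x, g):
    """Predicted probability from the logistic model above."""
    linear_predictor = B0 + B1 * x + B2 * g + B3 * x * g
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def group_gap(x):
    """Difference in predicted probability between g=1 and g=0 at a given x."""
    return predicted_probability(x, 1) - predicted_probability(x, 0)


# The gap between groups is not constant: it depends on where x is evaluated.
for x in (0, 4, 10):
    print(f"x={x:>2}: gap in predicted probability = {group_gap(x):.4f}")
```

With these made-up coefficients, the gap between groups is small at low and high values of x and largest in between, even though the model has only one interaction coefficient.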
Using a dataset based on the General Social Survey, Mitchell starts with a basic linear regression with a single independent variable and then illustrates how to tabulate and graph predicted values. Mitchell focuses on Stata’s margins and marginsplot commands, which play a central role in the book and which greatly simplify the calculation and presentation of results from regression models. In particular, through use of the marginsplot command, he shows how you can graphically visualize every model presented in the book and thus gain insight into results much more easily than you can from a mundane table of results.
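As a rough illustration of what "tabulating predicted values" means for a simple linear regression, the following Python sketch fits a one-predictor model by ordinary least squares and evaluates the fitted line at chosen values of the predictor. The data, the variable names (educ, income), and the helper functions are hypothetical. The book does this with Stata's margins command, which also reports standard errors and confidence intervals; those are omitted here.

```python
# Hypothetical data (illustrative only): years of education and income in thousands.
educ = [8, 10, 12, 14, 16]
income = [20, 26, 30, 38, 41]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_ols(educ, income)


def predicted_mean(x_value):
    """Predicted mean of the outcome at a given value of the predictor."""
    return intercept + slope * x_value


# Tabulate predicted means at selected education levels.
for value in (8, 12, 16):
    print(f"educ={value:>2}: predicted income = {predicted_mean(value):.1f}")
```

Plotting these predicted means against the chosen predictor values gives the kind of picture that marginsplot produces after margins.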
Mitchell then proceeds to more complicated models where the effects of the independent variables are nonlinear. After discussing how to detect nonlinear effects, he presents examples using both standard polynomial terms (squares and cubes of variables) and fractional polynomial models, where independent variables can be raised to powers like -1 or 1/2. In all cases, Mitchell again uses the marginsplot command to illustrate the effect that changing an independent variable has on the dependent variable. Piecewise linear models are presented as well; these are linear models in which the slope or intercept is allowed to change depending on the range of an independent variable. He also uses the contrast command when discussing categorical variables; as the name suggests, this command allows you to easily contrast predictions made for various levels of the categorical variable.
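One common way to write a piecewise linear model with a single known knot and a jump is sketched below. The notation is an assumption made for illustration (knot location $k$, coefficients $\beta_0,\dots,\beta_3$); the book discusses several coding schemes, and its parameterization may differ.

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 (x_i - k)_{+} + \beta_3 \, \mathbf{1}[x_i \ge k] + \varepsilon_i,
\qquad (x_i - k)_{+} = \max(0,\; x_i - k)
```

Under this coding, the slope is $\beta_1$ below the knot and $\beta_1 + \beta_2$ at or above it, so $\beta_2$ is the change in slope, while $\beta_3$ is the sudden change in the predicted outcome at $x = k$. Setting $\beta_3 = 0$ gives a model whose slope changes at the knot but whose fitted line remains continuous.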
Interaction terms can be tricky to interpret, but Mitchell shows how graphs produced by marginsplot greatly clarify results. Individual chapters are devoted to two- and three-way interactions containing all continuous or all categorical variables and include many practical examples. Raw regression output including interactions of continuous and categorical variables can be nearly impossible to interpret, but again Mitchell makes this a snap through judicious use of the margins and marginsplot commands in subsequent chapters.
The first two-thirds of the book is devoted to cross-sectional data, while the final third considers longitudinal data and complex survey data. A significant difference between this book and most others on regression models is that Mitchell spends quite some time on fitting and visualizing discontinuous models, that is, models where the outcome can change value suddenly at thresholds. Such models are natural in settings such as education and policy evaluation, where graduation or policy changes can produce sudden changes in income or revenue.
The second edition has been updated to incorporate many new features added since Stata 12, when the first edition was written. Specifically, the text now demonstrates how labels on the values of categorical variables make interpretation much easier when looking at regression results and results from the margins and contrast commands. For instance, you now see that your coefficients or marginal means are related to the “low-dose” and “high-dose” groups instead of groups 1 and 2. In addition, Mitchell now shows you how to customize output from estimation commands, margins, and contrast for even more clarity. In his discussion of customizing graphs produced by marginsplot, he demonstrates new graph features such as the use of transparency. He also includes new examples of multilevel models for longitudinal data that take advantage of the degree-of-freedom adjustments for small sample sizes that are now provided by mixed and contrast.
This book is a worthwhile addition to the library of anyone involved in statistical consulting, teaching, or collaborative applied statistical environments. Graphs greatly aid the interpretation of regression models, and Mitchell’s book shows you how.
List of Tables
List of Figures
Preface to the Second Edition
Acknowledgements
INTRODUCTION
Read me first
The GSS dataset
Age
Education
Gender
The optimism datasets
The school datasets
The sleep datasets
Overview of the book
I CONTINUOUS PREDICTORS
CONTINUOUS PREDICTORS: LINEAR
Chapter overview
Simple linear regression
Computing predicted means using the margins command
Graphing predicted means using the marginsplot command
Multiple regression
Computing adjusted means using the margins command
Some technical details about adjusted means
Graphing adjusted means using the marginsplot command
Checking for nonlinearity graphically
Using scatterplots to check for nonlinearity
Checking for nonlinearity using residuals
Checking for nonlinearity using locally weighted smoother
Graphing outcome mean at each level of predictor
Summary
Checking for nonlinearity analytically
Adding power terms
Using factor variables
Summary
CONTINUOUS PREDICTORS: POLYNOMIALS
Chapter overview
Quadratic (squared) terms
Overview
Examples
Cubic (third power) terms
Overview
Examples
Fractional polynomial regression
Overview
Example using fractional polynomial regression
Main effects with polynomial terms
Summary
CONTINUOUS PREDICTORS: PIECEWISE MODELS
Chapter overview
Introduction to piecewise regression models
Piecewise with one known knot
Overview
Examples using the GSS
Piecewise with two known knots
Overview
Examples using the GSS
Piecewise with one knot and one jump
Overview
Examples using the GSS
Piecewise with two knots and two jumps
Overview
Examples using the GSS
Piecewise with an unknown knot
Piecewise model with multiple unknown knots
Piecewise models and the marginsplot command
Automating graphs of piecewise models
Summary
CONTINUOUS BY CONTINUOUS INTERACTIONS
Chapter overview
Linear by linear interactions
Overview
Example using GSS data
Interpreting the interaction in terms of age
Interpreting the interaction in terms of education
Interpreting the interaction in terms of age slope
Interpreting the interaction in terms of the educ slope
Linear by quadratic interactions
Overview
Example using GSS data
Summary
CONTINUOUS BY CONTINUOUS BY CONTINUOUS INTERACTIONS
Chapter overview
Overview
Examples using the GSS data
A model without a three-way interaction
A three-way interaction model
Summary
II CATEGORICAL PREDICTORS
CATEGORICAL PREDICTORS
Chapter overview
Comparing two groups using a t test
More groups and more predictors
Overview of contrast operators
Compare each group against a reference group
Selecting a specific contrast
Selecting a different reference group
Selecting a contrast and reference group
Compare each group against the grand mean
Selecting a specific contrast
Compare adjacent means
Reverse adjacent contrasts
Selecting a specific contrast
Comparing the mean of subsequent or previous levels
Comparing the mean of previous levels
Selecting a specific contrast
Polynomial contrasts
Custom contrasts
Weighted contrasts
Pairwise comparisons
Interpreting confidence intervals
Testing categorical variables using regression
Summary
CATEGORICAL BY CATEGORICAL INTERACTIONS
Chapter overview
Two by two models: Example 1
Simple effects
Estimating the size of the interaction
More about interaction
Summary
Two by three models
Example 2
Example 3
Summary
Three by three models: Example 4
Simple effects
Simple contrasts
Partial interaction
Interaction contrasts
Summary
Unbalanced designs
Main effects with interactions: anova versus regress
Interpreting confidence intervals
Summary
CATEGORICAL BY CATEGORICAL BY CATEGORICAL INTERACTIONS
Chapter overview
Two by two by two models
Simple interactions by season
Simple interactions by depression status
Simple effects
Two by two by three models
Simple interactions by depression status
Simple partial interaction by depression status
Simple contrasts
Partial interactions
Three by three by three models and beyond
Partial interactions and interaction contrasts
Simple interactions
Simple effects and simple comparisons
Summary
III CONTINUOUS AND CATEGORICAL PREDICTORS
LINEAR BY CATEGORICAL INTERACTIONS
Chapter overview
Linear and two-level categorical: No interaction
Overview
Examples using the GSS
Linear by two-level categorical interactions
Overview
Examples using the GSS
Linear by three-level categorical interactions
Overview
Examples using the GSS
Summary
POLYNOMIAL BY CATEGORICAL INTERACTIONS
Chapter overview
Quadratic by categorical interactions
Overview
Quadratic by two-level categorical
Quadratic by three-level categorical
Cubic by categorical interactions
Summary
PIECEWISE BY CATEGORICAL INTERACTIONS
Chapter overview
One knot and one jump
Comparing slopes across gender
Comparing slopes across education
Difference in differences of slopes
Comparing changes in intercepts
Computing and comparing adjusted means
Graphing adjusted means
Two knots and two jumps
Comparing slopes across gender
Comparing slopes across education
Difference in differences of slopes
Comparing changes in intercepts by gender
Comparing changes in intercepts by education
Computing and comparing adjusted means
Graphing adjusted means
Comparing coding schemes
Coding scheme #1
Coding scheme #2
Coding scheme #3
Coding scheme #4
Choosing coding schemes
Summary
CONTINUOUS BY CONTINUOUS BY CATEGORICAL INTERACTIONS
Chapter overview
Linear by linear by categorical interactions
Fitting separate models for males and females
Fitting a combined model for males and females
Interpreting the interaction focusing on the age slope
Interpreting the interaction focusing on the educ slope
Estimating and comparing adjusted means by gender
Linear by quadratic by categorical interactions
Fitting separate models for males and females
Fitting a common model for males and females
Interpreting the interaction
Estimating and comparing adjusted means by gender
Summary
CONTINUOUS BY CATEGORICAL BY CATEGORICAL INTERACTIONS
Chapter overview
Simple effects of gender on the age slope
Simple effects of education on the age slope
Simple contrasts on education for the age slope
Partial interaction on education for the age slope
Summary
IV BEYOND ORDINARY LINEAR REGRESSION
MULTILEVEL MODELS
Chapter overview
Example 1: Continuous by continuous interaction
Example 2: Continuous by categorical interaction
Example 3: Categorical by continuous interaction
Example 4: Categorical by categorical interaction
Summary
TIME AS A CONTINUOUS PREDICTOR
Chapter overview
Example 1: Linear effect of time
Example 2: Linear effect of time by a categorical predictor
Example 3: Piecewise modeling of time
Example 4: Piecewise effects of time by a categorical predictor
Baseline slopes
Change in slopes: Treatment versus baseline
Jump at treatment
Comparisons among groups
Summary
TIME AS A CATEGORICAL PREDICTOR
Chapter overview
Example 1: Time treated as a categorical variable
Example 2: Time (categorical) by two groups
Example 3: Time (categorical) by three groups
Comparing models with different residual covariance structures
Summary
NONLINEAR MODELS
Chapter overview
Binary logistic regression
A logistic model with one categorical predictor
A logistic model with one continuous predictor
A logistic model with covariates
Multinomial logistic regression
Ordinal logistic regression
Poisson regression
More applications of nonlinear models
Categorical by categorical interaction
Categorical by continuous interaction
Piecewise modeling
Summary
COMPLEX SURVEY DATA
V APPENDICES
Specifying the confidence level
Customizing the formatting of columns in the coefficient table
Customizing the display of factor variables
The at() option
Margins with factor variables
Margins with factor variables and the at() option
The dydx() and related options
Specifying the confidence level
Customizing column formatting
Customizing the display of factor variables
Adjustments for multiple comparisons
Specifying the confidence level
Customizing column formatting
THE PWCOMPARE COMMAND