A Gentle Introduction to Stata, Fourth Edition
by Alan C. Acock
Alan C. Acock’s A Gentle Introduction to Stata, Fourth Edition is aimed at
new Stata users who want to become proficient in Stata. After reading
this introductory text, new users will not only be able to use Stata
well but will also learn new aspects of Stata. Acock assumes that the
user is not familiar with any statistical software. This assumption of
a blank slate is central to the structure and contents of the book.
Acock starts with the basics; for example, the portion of the book that
deals with data management begins with a careful and detailed example
of turning survey data on paper into a Stata-ready dataset on the
computer. When explaining how to go about basic exploratory statistical
procedures, Acock includes notes that will help the reader develop good
work habits. This mixture of explaining good Stata habits and good
statistical habits continues throughout the book.
Acock is quite careful to teach the reader all aspects of using Stata.
He covers data management, good work habits (including the use of basic
do-files), basic exploratory statistics (including graphical displays),
and analyses using the standard array of basic statistical tools
(correlation, linear and logistic regression, and parametric and
nonparametric tests of location and dispersion). He also successfully
introduces some more advanced topics such as multiple imputation and
structural equation modeling in a very approachable manner. Acock
teaches Stata commands by using the menus and dialog boxes while still
stressing the value of do-files. In this way, he ensures that all types
of users can build good work habits. Each chapter has exercises that
the motivated reader can use to reinforce the material.
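As a rough illustration of the do-file habit described above, a basic do-file might look like the sketch below. It is not taken from the book: the log name example.log is a hypothetical placeholder, and auto.dta (a dataset shipped with Stata) stands in for whatever data a reader is working with.

```stata
* example.do: a minimal do-file sketch (file names here are hypothetical)
version 13                      // record the Stata version the file was written for
capture log close               // close any log left open by an earlier run
log using example.log, replace text

sysuse auto, clear              // load a dataset that ships with Stata
describe                        // list variables and their storage types
summarize mpg weight            // basic descriptive statistics
tabulate foreign                // frequency table for a categorical variable

log close
```

Running the whole file from the Do-file Editor reproduces every step and leaves a text log of the results.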
The tone of the book is friendly and conversational without ever being
glib or condescending. Important asides and notes about terminology are
set off in boxes, which makes the text easy to read without any
convoluted twists or forward referencing. Rather than splitting topics
by their Stata implementation, Acock arranges the topics as they would
appear in a basic statistics textbook; graphics and postestimation are
woven into the material in a natural fashion. Real datasets, such as
the General Social Surveys from 2002 and 2006, are used throughout the
book.
The focus of the book is especially helpful for those in psychology and
the social sciences because the presentation of basic statistical
modeling is supplemented with discussions of effect sizes and
standardized coefficients. Model-selection criteria, such as
semipartial correlations, are discussed as well. Acock also
covers a variety of commands available for evaluating reliability and
validity of measurements.
The fourth edition has been updated to include new features in Stata
13. Effect-size computation is performed using the esize and estat
esize commands. Power and sample-size analysis for two-sample tests of
means, as well as one-way, two-way, and repeated measures ANOVA models,
is demonstrated using the power suite of commands. The multiple
regression chapter includes a new section on modeling quadratic
relationships. The chapter on logistic regression contains new material
on examining effects of predictors using margins and marginsplot. A
newly added chapter is devoted to Stata’s sem and gsem commands for
structural equation modeling. This chapter focuses on fitting linear
and logistic regression models, thinking of these models in terms of
path diagrams, and expanding the capabilities of regress and logistic
using sem and gsem. After covering models with one response variable,
Acock extends these concepts to performing path analysis.
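The commands named in this paragraph can be sketched in a short session. This is an illustrative sketch, not an example from the book: it assumes Stata 13 or later, uses auto.dta (shipped with Stata) in place of the book's datasets, and the means and standard deviation passed to power are made-up inputs.

```stata
* Sketch only: auto.dta stands in for the book's datasets
sysuse auto, clear

* Effect size for a two-group comparison of means
esize twosample mpg, by(foreign)

* Effect sizes after fitting a linear model
regress mpg i.foreign weight
estat esize

* Sample size for a two-sample test of means (illustrative inputs)
power twomeans 20 25, sd(6) power(0.8)

* Quadratic relationship written with factor-variable notation
regress mpg c.weight##c.weight

* Logistic regression, then predicted probabilities across values of mpg
logit foreign mpg weight
margins, at(mpg=(15(5)35))
marginsplot

* The same kinds of linear and logistic models expressed with sem and gsem
sem (mpg <- weight displacement)
gsem (foreign <- mpg weight, logit)
```

The sem and gsem lines fit regression-type models with a single response variable each, which is the starting point the paragraph describes before path analysis.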
Table of Contents
List of figures
List of tables
List of boxed tips
Preface (pdf)
Support materials for the book
1 Getting started
1.1 Conventions
1.2 Introduction
1.3 The Stata screen
1.4 Using an existing dataset
1.5 An example of a short Stata session
1.6 Summary
1.7 Exercises
2 Entering data
2.1 Creating a dataset
2.2 An example questionnaire
2.3 Develop a coding system
2.4 Entering data using the Data Editor
2.4.1 Value labels
2.5 The Variables Manager
2.6 The Data Editor (Browse) view
2.7 Saving your dataset
2.8 Checking the data
2.9 Summary
2.10 Exercises
3 Preparing data for analysis
3.1 Introduction
3.2 Planning your work
3.3 Creating value labels
3.4 Reverse-code variables
3.5 Creating and modifying variables
3.6 Creating scales
3.7 Save some of your data
3.8 Summary
3.9 Exercises
4 Working with commands, do-files, and results
4.1 Introduction
4.2 How Stata commands are constructed
4.3 Creating a do-file
4.4 Copying your results to a word processor
4.5 Logging your command file
4.6 Summary
4.7 Exercises
5 Descriptive statistics and graphs for one variable
5.1 Descriptive statistics and graphs
5.2 Where is the center of a distribution?
5.3 How dispersed is the distribution?
5.4 Statistics and graphs—unordered categories
5.5 Statistics and graphs—ordered categories and variables
5.6 Statistics and graphs—quantitative variables
5.7 Summary
5.8 Exercises
6 Statistics and graphs for two categorical variables
6.1 Relationship between categorical variables
6.2 Cross-tabulation
6.3 Chi-squared test
6.3.1 Degrees of freedom
6.3.2 Probability tables
6.4 Percentages and measures of association
6.5 Odds ratios when dependent variable has two categories
6.6 Ordered categorical variables
6.7 Interactive tables
6.8 Tables—linking categorical and quantitative variables
6.9 Power analysis when using a chi-squared test of significance
6.10 Summary
6.11 Exercises
7 Tests for one or two means
7.1 Introduction to tests for one or two means
7.2 Randomization
7.3 Random sampling
7.4 Hypotheses
7.5 One-sample test of a proportion
7.6 Two-sample test of a proportion
7.7 One-sample test of means
7.8 Two-sample test of group means
7.8.1 Testing for unequal variances
7.9 Repeated-measures t test
7.10 Power analysis
7.11 Nonparametric alternatives
7.11.1 Mann–Whitney two-sample rank-sum test
7.11.2 Nonparametric alternative: Median test
7.12 Summary
7.13 Exercises
8 Bivariate correlation and regression
8.1 Introduction to bivariate correlation and regression
8.2 Scattergrams
8.3 Plotting the regression line
8.4 An alternative to producing a scattergram, binscatter
8.5 Correlation
8.6 Regression
8.7 Spearman’s rho: Rank-order correlation for ordinal data
8.8 Summary
8.9 Exercises
9 Analysis of variance
9.1 The logic of one-way analysis of variance
9.2 ANOVA example
9.3 ANOVA example using survey data
9.4 A nonparametric alternative to ANOVA
9.5 Analysis of covariance
9.6 Two-way ANOVA
9.7 Repeated-measures design
9.8 Intraclass correlation—measuring agreement
9.9 Power analysis with ANOVA
9.9.1 One-way ANOVA
Power analysis for two-way ANOVA
9.9.2 Power analysis for repeated-measures ANOVA
9.9.3 Summary of power analysis for ANOVA
9.10 Summary
9.11 Exercises
10 Multiple regression
10.1 Introduction to multiple regression
10.2 What is multiple regression?
10.3 The basic multiple regression command
10.4 Increment in R-squared: Semipartial correlations
10.5 Is the dependent variable normally distributed?
10.6 Are the residuals normally distributed?
10.7 Regression diagnostic statistics
10.7.1 Outliers and influential cases
10.7.2 Influential observations: DFbeta
10.7.3 Combinations of variables may cause problems
10.8 Weighted data
10.9 Categorical predictors and hierarchical regression
10.10 A shortcut for working with a categorical variable
10.11 Fundamentals of interaction
10.12 Nonlinear relations
10.12.1 Fitting a quadratic model
10.12.2 Centering when using a quadratic term
10.12.3 Do we need to add a quadratic component?
10.13 Power analysis in multiple regression
10.14 Summary
10.15 Exercises
11 Logistic regression
11.1 Introduction to logistic regression
11.2 An example
11.3 What is an odds ratio and a logit?
11.3.1 The odds ratio
11.3.2 The logit transformation
11.4 Data used in rest of chapter
11.5 Logistic regression
11.6 Hypothesis testing
11.6.1 Testing individual coefficients
11.6.2 Testing sets of coefficients
11.7 More on interpreting results from logistic regression
11.8 Nested logistic regressions
11.9 Power analysis when doing logistic regression
11.10 Summary
11.11 Exercises
12 Measurement, reliability, and validity
12.1 Overview of reliability and validity
12.2 Constructing a scale
12.2.1 Generating a mean score for each person
12.3 Reliability
12.3.1 Stability and test–retest reliability
12.3.2 Equivalence
12.3.3 Split-half and alpha reliability—internal consistency
12.3.4 Kuder–Richardson reliability for dichotomous items
12.3.5 Rater agreement—kappa (K)
12.4 Validity
12.4.1 Expert judgment
12.4.2 Criterion-related validity
12.4.3 Construct validity
12.5 Factor analysis
12.6 PCF analysis
12.6.1 Orthogonal rotation: Varimax
12.6.2 Oblique rotation: Promax
12.7 But we wanted one scale, not four scales
12.7.1 Scoring our variable
12.8 Summary
12.9 Exercises
13 Working with missing values—multiple imputation
13.1 The nature of the problem
13.2 Multiple imputation and its assumptions about the mechanism for missingness
13.3 What variables do we include when doing imputations?
13.4 Multiple imputation
13.5 A detailed example
13.5.1 Preliminary analysis
13.5.2 Setup and multiple-imputation stage
13.5.3 The analysis stage
13.5.4 For those who want an R-squared and standardized βs
13.5.5 When impossible values are imputed
13.6 Summary
13.7 Exercises
14 The sem and gsem commands
14.1 Ordinary least-squares regression models using sem
14.1.1 Using the SEM Builder to fit a basic regression model
14.2 A quick way to draw a regression model and a fresh start
14.2.1 Using sem without the SEM Builder
14.3 The gsem command for logistic regression
14.3.1 Fitting the model using the logit command
14.3.2 Fitting the model using the gsem command
14.4 Path analysis and mediation
14.5 Conclusions and what is next for the sem command
14.6 Exercises
A What’s next?
A.1 Introduction to the appendix
A.2 Resources
A.2.1 Web resources
A.2.2 Books about Stata
A.2.3 Short courses
A.2.4 Acquiring data
A.3 Summary
References
Author index (pdf)
Subject index (pdf)