Data Analysis Using Stata, Third Edition has been completely revamped to reflect the capabilities of Stata 12. This book will appeal to those just learning statistics and Stata, as well as to the many users who are switching to Stata from other packages. Throughout the book, Kohler and Kreuter show examples using data from the German Socio-Economic Panel, a large survey of households containing demographic, income, employment, and other key information.
Kohler and Kreuter take a hands-on approach, first showing how to use Stata’s graphical interface and then describing Stata’s syntax. The core of the book covers all aspects of social science research, including data manipulation, production of tables and graphs, linear regression analysis, and logistic modeling. The authors describe Stata’s handling of categorical covariates and show how the new margins and marginsplot commands greatly simplify the interpretation of regression and logistic results. An entirely new chapter discusses aspects of statistical inference, including random samples, complex survey samples, nonresponse, and causal inference.
The rest of the book includes chapters on reading text files into Stata, writing programs and do-files, and using Internet resources such as the search command and the SSC archive.
Data Analysis Using Stata, Third Edition has been structured so that it can be used as a self-study course or as a textbook in an introductory data analysis or statistics course. It will appeal to students and academic researchers in all the social sciences.
List of Tables
List of Figures
Preface
1. “THE FIRST TIME”
Starting Stata
Setting up your screen
Your first analysis
Inputting commands
Files and the working memory
Loading data
Variables and observations
Looking at data
Interrupting a command and repeating a command
The variable list
The in qualifier
Summary statistics
The if qualifier
Defining missing values
The by prefix
Command options
Frequency tables
Graphs
Getting help
Recoding of variables
Variable labels and value labels
Linear regression
Do-files
Exiting Stata
Exercises
2. WORKING WITH DO-FILES
From interactive work to working with a do-file
Alternative 1
Alternative 2
Designing do-files
Comments
Line breaks
Some crucial commands
Organizing your work
Exercises
3. THE GRAMMAR OF STATA
The elements of Stata commands
Stata commands
The variable list
List of variables: required or optional
Abbreviation rules
Special listings
Options
The in qualifier
The if qualifier
Expressions
Operators
Functions
Lists of numbers
Using filenames
Repeating similar commands
The by prefix
The foreach loop
The types of foreach lists
Several commands within a foreach loop
The forvalues loop
Weights
Frequency weights
Analytic weights
Probability weights
Exercises
4. GENERAL COMMENTS ON THE STATISTICAL COMMANDS
Regular statistical commands
Estimation commands
Exercises
5. CREATING AND CHANGING VARIABLES
The commands generate and replace
Variable names
Some examples
Useful functions
Changing codes with by, _n, and _N
Subscripts
Specialized recoding commands
The recode command
The egen command
Recoding string variables
Recoding date and time
Dates
Time
Setting missing values
Labels
Storage types, or, the ghost in the machine
Exercises
6. CREATING AND CHANGING GRAPHS
A primer on graph syntax
Graph types
Examples
Specialized graphs
Graph elements
Appearance of data
Choice of marker
Marker colors
Marker size
Lines
Graphs and plot regions
Graph size
Plot region
Scaling the axes
Information inside the plot region
Reference lines
Labeling inside the plot region
Information outside the plot region
Labeling the axes
Tick lines
Axis titles
The legend
Graph titles
Multiple graphs
Overlaying numerous twoway graphs
Option by()
Combining graphs
Saving and printing graphs
Exercises
7. DESCRIBING AND COMPARING DISTRIBUTIONS
Categories: Few or many?
Variables with few categories
Tables
Frequency tables
More than one frequency table
Comparing distributions
Summary statistics
More than one contingency table
Graphs
Histograms
Bar charts
Pie charts
Dot chart
Variables with many categories
Frequencies of grouped data
Some remarks on grouping data
Special techniques for grouping data
Describing data using statistics
Important summary statistics
The summarize command
The tabstat command
Comparing distributions using statistics
Graphs
Box plots
Histograms
Kernel density estimation
Quantile plot
Comparing distributions with Q–Q plots
Exercises
8. STATISTICAL INFERENCE
Random samples and sampling distributions
Random numbers
Creating fictitious datasets
Drawing random samples
The sampling distribution
Descriptive inference
Standard errors for simple random samples
Standard errors for complex samples
Typical forms of complex samples
Sampling distributions for complex samples
Using Stata’s svy commands
Standard errors with nonresponse
Unit nonresponse and poststratification weights
Item nonresponse and multiple imputation
Uses of standard errors
Confidence intervals
Significance tests
Two-group mean comparison test
Causal inference
Basic concepts
Data-generating processes
Counterfactual concept of causality
The effect of third-class tickets
Some problems of causal inference
Exercises
9. INTRODUCTION TO LINEAR REGRESSION
Simple linear regression
The basic principle
Linear regression using Stata
The table of coefficients
Standard errors
The table of ANOVA results
The model fit table
Multiple regression
Multiple regression using Stata
Additional components
Adjusted R²
Standardized regression coefficients
What does “under control” mean?
Regression diagnostics
Violation of E(ε_i) = 0
Linearity
Influential cases
Omitted variables
Multicollinearity
Violation of Var(ε_i) = σ²
Violation of Cov(ε_i, ε_j) = 0, i ≠ j
Model extensions
Categorical independent variables
Interaction terms
Regression models using transformed variables
Nonlinear relations
Eliminating heteroskedasticity
Reporting regression results
Tables of similar regression models
Plots of coefficients
Conditional-effects plots
Advanced techniques
Median regression
Regression models for panel data
From wide to long format
Fixed-effects models
Error-component models
Exercises
10. REGRESSION MODELS FOR CATEGORICAL DEPENDENT VARIABLES
The linear probability model
Basic concepts
Odds, log odds, and odds ratios
Excursion: The maximum likelihood principle
Logistic regression with Stata
The coefficients table
Sign interpretation
Interpretation with odds ratios
Probability interpretation
Average marginal effects
The iteration block
The model fit block
Classification tables
Pearson chi-squared
Logistic regression diagnostics
Linearity
Influential cases
Likelihood-ratio test
Refined models
Nonlinear relationships
Interaction effects
Advanced techniques
Probit models
Multinomial logistic regression
Models for ordinal data
Exercises
11. READING AND WRITING DATA
The goal: The data matrix
Importing machine-readable data
Reading system files from other packages
Reading Excel files
Reading SAS transport files
Reading other system files
Reading ASCII text files
Reading data in spreadsheet format
Reading data in free format
Reading data in fixed format
Inputting data
Input data using the editor
The input command
Combining data
The GSOEP database
The merge command
Merge 1:1 matches with rectangular data
Merge 1:1 matches with nonrectangular data
Merging more than two files
Merging m:1 and 1:m matches
The append command
Saving and exporting data
Handling large datasets
Rules for handling the working memory
Using oversized datasets
Exercises
12. DO-FILES FOR ADVANCED USERS AND USER-WRITTEN PROGRAMS
Two examples of usage
Four programming tools
Local macros
Calculating with local macros
Combining local macros
Changing local macros
Do-files
Programs
The problem of redefinition
The problem of naming
The problem of error checking
Programs in do-files and ado-files
User-written Stata commands
Sketch of the syntax
Create a first ado-file
Parsing variable lists
Parsing options
Parsing if and in qualifiers
Generating an unknown number of variables
Default values
Extended macro functions
Avoiding changes in the dataset
Help files
Exercises
13. AROUND STATA
Resources and information
Taking care of Stata
Additional procedures
Stata Journal ado-files
SSC ado-files
Other ado-files
Exercises