This book is a supplement to Principles of Econometrics, 5th Edition by R. Carter Hill, William E. Griffiths and Guay C. Lim (Wiley, 2018), hereinafter POE5. This book is not a substitute for the textbook, nor is it a standalone computer manual. It is a companion to the textbook, showing how to perform the examples in the textbook using Stata Release 15. This book will be useful to students taking econometrics, as well as their instructors, and others who wish to use Stata for econometric analysis.
CHAPTER 1 INTRODUCING STATA
Starting Stata
The opening display
Exiting Stata
Stata data files for POE5
A working directory
Opening Stata data files
Using the toolbar
The use command
Using files on the internet
Locating book files on the internet
The variables window
Using the data utility for a single label
Describing data and obtaining summary statistics
The Stata help system
Using keyword search
Opening a dialog box
Complete documentation in Stata manuals
Advice
Stata videos on YouTube
Statalist
Not elsewhere classified
Stata command syntax
Syntax of summarize
Learning syntax using the review window
Saving your work
Copying and pasting
Using a log file
Using the data browser
Using Stata graphics
Histograms
Scatter diagrams
Using Stata Do-files
Creating and managing variables
Creating (generating) new variables
Using the expression builder
Dropping or keeping variables and observations
Using arithmetic operators
Using Stata math functions
Using Stata density functions
Cumulative distribution functions
Inverse cumulative distribution functions
Using and displaying scalars
Example of standard normal cdf
Example of t-distribution tail-cdf
Example computing percentile of the standard normal
Example computing percentile of the t-distribution
A scalar dialog box
Using temporary scalars
Chapter 1 Do-file
CHAPTER 2 SIMPLE LINEAR REGRESSION
The food expenditure data
Starting a new problem
Starting a log file
Opening a Stata data file
Browsing and listing the data
Computing summary statistics
Creating a scatter diagram
Enhancing the plot
Regression
Fitted values and residuals
Plotting the fitted regression line
Using Stata to obtain predicted values
Using saved coefficients
Using lincom
Using the margins command
Using incomplete observations
Computing an elasticity
OLS estimator variances and covariance
Estimating the variance of the error term
Viewing estimated variances and covariance
Saving the Stata data file
Estimating nonlinear relationships
A quadratic model
A log-linear model
Regression with indicator variables
Appendix 2A Average marginal effects
Elasticity in a linear relationship
Elasticity in a quadratic relationship
Slope in a log-linear model
Appendix 2B Simulation experiments
Fixed x’s
Random x’s
Chapter 2 Do-file
CHAPTER 3 INTERVAL ESTIMATION AND HYPOTHESIS TESTING
Interval estimates
Critical values from the t-distribution
Creating an interval estimate
Creating an interval estimate using lincom
Hypothesis tests
Right-tail test of significance
Right-tail test of an economic hypothesis
Left-tail test of an economic hypothesis
Two-tail test of an economic hypothesis
Two-tail test of significance
p-values
p-value of a right-tail test
p-value of a left-tail test
p-value for a two-tail test
p-values in Stata output
Testing and estimating linear combinations of parameters
Appendix Graphical tools
Appendix Monte Carlo simulation
Fixed x’s
Random x’s
Chapter 3 Do-file
CHAPTER 4 PREDICTION, GOODNESS-OF-FIT AND MODELING ISSUES
Least squares prediction
Editing the data
Estimate the regression and obtain postestimation results
Creating the prediction interval
Using margins to create the prediction interval
Measuring goodness-of-fit
Correlations and R²
The effects of scaling and transforming the data
Reporting regression results
The linear-log functional form
Plotting the fitted linear-log model
Editing graphs
Analyzing the residuals
Residual plots
The Jarque-Bera test
Chi-square distribution critical values
Chi-square distribution p-values
Polynomial models
Estimating and checking the linear relationship
Estimating and checking a cubic equation
Estimating a log-linear yield growth model
Estimating a log-linear wage equation
The log-linear model
Calculating wage predictions
Constructing wage plots
Generalized R²
Prediction intervals in the log-linear model
Prediction intervals in the log-linear model using margins
A log-log model
Chapter 4 Do-file
CHAPTER 5 MULTIPLE LINEAR REGRESSION
The Hamburger Chain Model
Least Squares Estimation
Least squares procedure
Least squares prediction
Rescaling the variables
Estimating the error variance
Measuring the goodness-of-fit
Frisch-Waugh-Lovell
Least Squares Precision
Confidence Intervals
Changing the confidence level
Linear combination of parameters
Hypothesis Tests
Two-sided t-test
One-sided t-test
Testing a linear combination of parameters
Interaction Variables
Polynomial regressors
Using factor variables for interactions
Interactions with other variables
Log-wages and quadratic interactions
Optimal level of advertising
Maximizing wages via experience
Appendix Nonlinear functions of a single parameter
Appendix Nonlinear functions of two parameters
Appendix Least squares estimation with chi-square errors
Appendix Monte Carlo simulation of the delta method
Appendix Bootstrapping
Chapter 5 Do-file
CHAPTER 6 FURTHER INFERENCE IN THE MULTIPLE REGRESSION MODEL
Testing joint hypotheses: The F-test
Testing the significance of the model
Relationship between t- and F-tests
More general F-tests
Large sample tests
Nonlinear hypothesis tests
Stata programs
Nonsample information
Model specification
Omitted variables
Irrelevant variables
Choosing the model
RESET test for functional form
RESET program
Control variables
Prediction-forecast error variance
Prediction-model selection and RMSE
Poor data, collinearity, and insignificance
Variance inflation factors
Influential observations
Nonlinear least squares
Chapter 6 Do-file
CHAPTER 7 USING INDICATOR VARIABLES
Indicator variables
Creating indicator variables
Estimating an indicator variable regression
Testing the significance of the indicator variables
Further calculations
Computing average marginal effects
Applying indicator variables
Interactions between qualitative factors
Adding regional indicators
Testing the equivalence of two regressions
Estimating separate regressions
Indicator variables in log-linear models
The linear probability model
Treatment effects
Differences-in-Differences estimation
Chapter 7 Do-file
CHAPTER 8 HETEROSKEDASTICITY
The nature of heteroskedasticity
Heteroskedastic-consistent standard errors
The generalized least squares estimator
Feasible GLS: a more general case
Feasible GLS with a heteroskedastic partition
Detecting heteroskedasticity
The Goldfeld-Quandt test using partitioned data
The Goldfeld-Quandt test in the food expenditure model
Lagrange multiplier tests
Heteroskedasticity in the linear probability model
Appendix Alternative robust sandwich estimators
Appendix Monte Carlo evidence
Chapter 8 Do-file
CHAPTER 9 REGRESSION WITH TIME-SERIES DATA: STATIONARY VARIABLES
Introduction
Defining time-series in Stata
Time-series plots
Stata’s lag and difference operators
Correlogram
The AR(2) model
Autoregressive distributed lag models
Forecasts and forecast intervals
Model selection
Granger causality
Serial correlation in residuals
Detecting autocorrelation in residuals
Okun’s Law
HAC standard errors
Nonlinear least squares
Feasible GLS
The consumption function
Multipliers for an IDL model
Durbin-Watson Test
Chapter 9 Do-file
CHAPTER 10 ENDOGENOUS REGRESSORS AND MOMENT BASED ESTIMATION
Least squares estimation of a wage equation
Two-stage least squares
IV estimation with surplus instruments
Illustrating partial correlations
The Hausman test for endogeneity
Testing the validity of surplus instruments
Testing for weak instruments
Calculating the Cragg-Donald F-statistic
Illustrations using simulated data
A simulation experiment
Chapter 10 Do-file
CHAPTER 11 SIMULTANEOUS EQUATIONS MODELS
Key Terms
Truffle supply and demand
Estimating the reduced form equations
2SLS estimates of truffle demand
2SLS estimates of truffle supply
Supply and demand of fish
Reduced forms for fish price and quantity
2SLS estimates of fish demand
2SLS alternatives
Monte Carlo simulation
Chapter 11 Do-file
CHAPTER 12 REGRESSION WITH TIME-SERIES DATA: NONSTATIONARY VARIABLES
Key Terms
Stationary and nonstationary data
Review: generating dates in Stata
Extracting dates
Graphing the data
Summary statistics using subsamples
Correlogram
Deterministic trends
Spurious regressions
Unit root tests for stationarity
Is GDP trend stationary?
Is wheat yield stationary?
Integration and cointegration
Order of integration
Engle-Granger test
The error correction model
Regression with no cointegration
Chapter 12 Do-file
CHAPTER 13 VECTOR ERROR CORRECTION AND VECTOR AUTOREGRESSIVE MODELS
VEC and VAR models
Estimating a VEC model
Estimating a VAR
Impulse responses and variance decompositions
Chapter 13 Do-file
CHAPTER 14 TIME-VARYING VOLATILITY AND ARCH MODELS
Key Terms
ARCH model and time-varying volatility
Simulating ARCH
Testing, estimating and forecasting
Extensions
GARCH
Threshold GARCH
GARCH-in-mean
Chapter 14 Do-file
CHAPTER 15 PANEL DATA MODELS
Key Terms
A microeconomic panel
The fixed-effects estimator
The difference estimator: T = 2
The within estimator: T = 2
The within estimator: T = 3
The fixed-effects estimator: xtreg
The least squares dummy variable estimator
Testing for fixed effects
Panel data regression error assumptions
OLS estimation with cluster-robust standard errors
Fixed-effects estimation with cluster-robust standard errors
Random-effects estimation of a production function
Random-effects estimation of a wage equation
Testing for random-effects
The Hausman contrast test for the production function
The Hausman contrast test for the wage equation
A regression based Hausman test for the production function
A regression based Hausman test for the wage equation
The Hausman-Taylor estimator
Chapter 15 Do-file
CHAPTER 16 QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS
Key Terms
Models with binary dependent variables
The linear probability model
Probit: a small example
Probit: the transportation data
Marginal effects
Probit marginal effects: details
Standard error of average marginal effect
The logit model for binary choice
Wald tests
Likelihood ratio tests
Binary choice models with a continuous endogenous variable
Multinomial logit
Conditional logit
Estimation using asclogit
Ordered choice models
Models for count data
Censored data models
Selection bias
Appendix 16D Tobit Monte Carlo experiment
Chapter 16 Do-file
APPENDIX A REVIEW OF MATH ESSENTIALS
Key Terms
Stata math and logical operators
Math functions
Extensions to generate
The calculator
Scientific notation
Logarithms
Numerical derivatives and integrals
Appendix A Do-file
APPENDIX B REVIEW OF PROBABILITY
Key Terms
Stata probability functions
Binomial distribution
Poisson distribution
Normal distribution
Normal density plots
Normal probability calculations
Chi-square distribution
Plotting the chi-square density
Chi-square probability calculations
The non-central chi-square pdf
Student’s t-distribution
Plot of standard normal and t(3)
t-distribution probabilities
Graphing tail probabilities
The non-central t-distribution
F-distribution
Plotting the F-density
F-distribution probability calculations
The non-central F-distribution
The log-normal distribution
Random numbers
Using inversion method
Creating uniform random numbers
Appendix B Do-file
APPENDIX C REVIEW OF STATISTICAL INFERENCE
Key Terms
Examining the hip data
Constructing a histogram
Obtaining summary statistics
Estimating the population mean
Using simulated data values
The central limit theorem
Estimating population moments
Interval estimation
Computing confidence intervals
Using simulated data
Using the hip data
Testing the mean of a normal population
Right-tail test
Two-tail test
Testing the variance of a normal population
Testing the equality of two normal population means
Population variances are equal
Population variances are unequal
Testing the equality of two normal population variances
Testing normality
Maximum likelihood estimation
Testing a population proportion
Likelihood ratio test
Wald test
Lagrange multiplier test
Least squares
Kernel density estimator
Appendix C Do-file