GOALS
This tutorial explores the estimation of a linear model. After this tutorial you should be able to
- Create linear data using the GAUSS random normal number generator and GAUSS matrix operations.
- Estimate the linear model using matrix operations.
- Estimate the linear model using the
ols
procedure.
INTRODUCTION
The linear regression model is one of the fundamental workhorses of econometrics and is used to model a wide variety of economic relationships. The general model assumes a linear relationship between a dependent variable, y
, and one or more independent variables, x
.
Y = α + βX
α = intercept
β = slope
Note: The graph above was created using the plotScatter
procedure. For more information on plotting data see our graph basics tutorial.
GENERATE LINEAR DATA
To create our linear data we use a simple univariate linear data generating process
where ϵi is the random disturbance term. To generate our data we break the process into three simple steps:
GENERATE INDEPENDENT DATA
Our first step is to generate a vector with 100 random x
values drawn from N(0,1) using the GAUSS command rndn
.
// Clear the workspace new; // Set seed to replicate results rndseed 23423; // Number of observations num_obs = 100; // Generate x ~ N(0,1), with // 'num_obs' rows and 1 column x = rndn(num_obs,1);
GENERATE THE ERROR TERM
We will use the same function, rndn
, to generate the random disturbances.
// Compute 100 observations of an error term ~ N(0,1) error_term = rndn(num_obs,1);
GENERATE THE DEPENDENT DATA
Finally, generate y
from x
and error_term
following the data generating process above.
// Simulate our dependent variable y = 1.3 + 5.7*x + error_term;
Note: The multiplication operator in GAUSS is overloaded to compute matrix multiplication if both inputs are matrices or vectors, and to compute element-by-element (ExE) multiplication if one of the operands is a scalar. Use the dot multiplication operator .*, to compute ExE multiplication of matrices, vectors or multi-dimensional arrays.
ESTIMATE THE MODEL USING MATRIX OPERATIONS
Using x
and y
, we can estimate the model parameters and compare our estimates to the true parameters. In order to estimate the constant in our model, we must add a vector of ones to the x
matrix. This is easily done using the function ones
in GAUSS.
Note: The tilde operator, ~, horizontally concatenates two matrices or vectors into one larger matrix.
// Create a new (num_obs x 2) matrix, 'x_mat', where // the first column is all ones x_mat = ones(num_obs, 1) ~ x;
We can now estimate the two parameters of the model, the constant and slope coefficient, using our x
matrix and y
vector :
© 2024 Aptech Systems, Inc. All rights reserved.
//Compute OLS estimates, using matrix operations beta_hat = inv(x_mat'x_mat)*(x_mat'y); print beta_hat;
The above print
statement should return the following output:
1.2795490 5.7218389
ESTIMATE THE MODEL USING ols
FUNCTION
Above we used matrix operations to calculate the parameters of the model. However, GAUSS includes a built-in function ols
which will perform the same estimation for us. The function will find the parameter models and provide several model diagnostics. ols
takes the following three inputs:
- dataset
- String, name dataset to use for regression. Use an empty string,
""
, ifx
andy
are matrices
- String, name dataset to use for regression. Use an empty string,
- x
- Matrix of independent variable for regression, or string with variable names
- y
- Vector of dependent variables, or string with name of independent variable
In our case, we will use the x
and y
data which we created. As these are not related to any dataset, we can enter a blank string, ""
for the dataset input. Furthermore, rather than sending the output to any variables, we will simply print the output to screen using the command call
.
call ols("", y, x);
Which should print a report similar to the following:
Valid cases: 100 Dependent variable: Y Missing cases: 0 Deletion method: None Total SS: 3481.056 Degrees of freedom: 98 R-squared: 0.969 Rbar-squared: 0.969 Residual SS: 107.205 Std error of est: 1.046 F(1,98): 3084.149 Probability of F: 0.000 Standard Prob Standardized Cor with Variable Estimate Error t-value >|t| Estimate Dep Var ----------------------------------------------------------------------------- CONSTANT 1.27955 0.105343 12.14651 0.000 --- --- X1 5.72184 0.103031 55.53512 0.000 0.984481 0.984481
CONCLUSION
Congratulations! You have:
- Used matrix operations to generate linear data.
- Estimated an OLS model using matrix operations.
- Estimates an OLS model using
ols
.
For convenience, the full program text is reproduced below.
The next tutorial describes using the estimated parameters to predict outcomes and compute residuals.
// Clear the workspace new; // Set seed to replicate results rndseed 23423; // Number of observations num_obs = 100; // Generate x ~ N(0,1), with // 'num_obs' rows and 1 column x = rndn(num_obs,1); // Compute 100 observations of an error term ~ N(0,1) error_term = rndn(num_obs,1); // Simulate our dependent variable y = 1.3 + 5.7*x + error_term; // Create a new (num_obs x 2) matrix, 'x_mat', where // the first column is all ones x_mat = ones(num_obs, 1) ~ x; // Compute OLS estimates, using matrix operations beta_hat = inv(x_mat'x_mat)*(x_mat'y); print beta_hat; call ols("", y, x);