The Mata Book: A Book for Serious Programmers and Those Who Want to Be is the book that Stata programmers have been waiting for. Mata is a serious programming language for developing small- and large-scale projects and for adding features to Stata. What makes Mata serious is that it provides structures, classes, and pointers along with matrix capabilities. The book is serious in that it both covers those advanced features and teaches them. The reader is assumed to have some programming experience, but no more than that. That experience could be with Stata’s ado-language or with Python, Java, C++, Fortran, or similar languages. As the book says, “being serious is a matter of attitude, not current skill level or knowledge.”
Acknowledgment
1. INTRODUCTION
Is this book for me?
What is Mata?
What is covered in this book
How to download the files for this book
2. THE MECHANICS OF USING MATA
Introduction
Mata code appearing in do-files
Mata code appearing in ado-files
Mata code to be exposed publicly
3. A PROGRAMMER’S TOUR OF MATA
Preliminaries
Results of expressions are displayed when not stored
Assignment
Multiple assignment
Real, complex, and string values
Real values
Complex values
String values (ASCII, Unicode, and binary)
Scalars, vectors, and matrices
Functions rows(), cols(), and length()
Function I()
Function J()
Row-join and column-join operators
Null vectors and null matrices
Mata’s advanced features
Variable types
Structures
Classes
Pointers
Notes for programmers
How programmers use Mata’s interactive mode
What happens when code has errors
The _error() abort function
4. MATA’S PROGRAMMING STATEMENTS
The structure of Mata programs
The program body
Expressions
Conditional execution statement
Looping statements
while
for
do while
continue and break
goto
return
Functions returning values
Functions returning void
5. MATA’S EXPRESSIONS
More surprises
Numeric and string literals
Numeric literals
Base-10 notation
Base-2 notation
Complex literals
String literals
Assignment operator
Operator precedence
Arithmetic operators
Increment and decrement operators
Logical operators
(Understand this ? skip : read) Ternary conditional operator
Matrix row and column join and range operators
Row and column join
Comma operator is overloaded
Row and column count vectors
Colon operators for vectors and matrices
Vector and matrix subscripting
Element subscripting
List subscripting
Permutation vectors
Use to sort data
Use in advanced mathematical programming
Submatrix subscripting
Pointer and address operators
Cast-to-void operator
6. MATA’S VARIABLE TYPES
Overview
The forty variable types
Default initialization
Default eltype, orgtype, and therefore, variable type
Partial types
A forty-first type for returned values from functions
Appropriate use of transmorphic
Use transmorphic for arguments of overloaded functions
Use transmorphic for output arguments
Use transmorphic for passthru variables
You must declare structures and classes if not passthru
How to declare pointers
7. MATA’S STRICT OPTION AND MATA’S PRAGMAS
Overview
Turning matastrict on and off
The messages that matastrict produces, and suppressing them
8. MATA’S FUNCTION ARGUMENTS
Introduction
Functions can change the contents of the caller’s arguments
How to document arguments that are changed
How to write functions that do not unnecessarily change arguments
How to write functions that allow a varying number of arguments
How to write functions that have multiple syntaxes
9. PROGRAMMING EXAMPLE: N_CHOOSE_K() THREE WAYS
Overview
Developing n_choose_k()
n_choose_k() packaged as a do-file
How I packaged the code: n_choose_k.do
How I could have packaged the code
n_choose_k.mata
test_n_choose_k.do
Certification files
n_choose_k() packaged as an ado-file
Writing Stata code to call Mata functions
nchooseki.ado
test_nchooseki.do
Mata code inside of ado-files is private
n_choose_k() packaged as a Mata library routine
Your approved source directory
make_lmatabook.do
test.do
hello.mata
n_choose_k.mata
test_n_choose_k.do
Building and rebuilding libraries
Deleting libraries
10. MATA’S STRUCTURES
Overview
You must define structures before using them
Structure jargon
Adding variables to structures
Structures containing other structures
Surprising things you can do with structures
Do not omit the word scalar in structure declarations
Structure vectors and matrices and use of the constructor function
Use of transmorphic with structures
Structure pointers
11. PROGRAMMING EXAMPLE: LINEAR REGRESSION
Introduction
Self-threading code
Linear-regression system lr*() version 1
lr*() in action
The calculations to be programmed
lr*() version-1 code listing
Discussion of the lr*() version-1 code
Getting started
Assume subroutines
Learn about Mata’s built-in subroutines
Use of built-in subroutine cross()
Use more subroutines
Linear-regression system lr*() version 2
The deviation from mean formulas
The lr*() version-2 code
lr*() version-2 code listing
Other improvements you could make
Closeout of lr*() version 2
Certification
Adding lr*() to the lmatabook.mlib library
12. MATA’S CLASSES
Overview
Classes contain member variables
Classes contain member functions
Member functions occult external functions
Members—variables and functions—can be private
Classes can inherit from other classes
Privacy versus protection
Subclass functions occult superclass functions
Multiple inheritance
And more
Class creation and deletion
The this prefix
Should all member variables be private?
Classes with no member variables
Inheritance
Virtual functions
Final functions
Polymorphisms
When to use inheritance
Pointers to class instances
13. PROGRAMMING EXAMPLE: LINEAR REGRESSION 2
Introduction
LinReg in use
LinReg version-1 code
Adding OPG and robust variance estimates to LinReg
Aside on numerical accuracy: Order of addition
Aside on numerical accuracy: Symmetric matrices
Finishing the code
LinReg version-2 code
Certifying LinReg version 2
Adding LinReg version 2 to the lmatabook.mlib library
14. BETTER VARIABLE TYPES
Overview
Stata’s macros
Using macros to create new types
Macroed types you might use
The boolean type
The Code type
Filehandle
Idiosyncratic types, such as Filenames
Macroed types for structures
Macroed types for classes
Macroed types to avoid name conflicts
15. PROGRAMMING CONSTANTS
Problem and solution
How to define constants
How to use constants
Where to place constant definitions
16. MATA’S ASSOCIATIVE ARRAYS
Introduction
Using class AssociativeArray
Finding out more about AssociativeArray
17. PROGRAMMING EXAMPLE: SPARSE MATRICES
Introduction
The idea
Design
Producing a design from an idea
The design goes bad
Fixing the design
Sketches of R_*x*() and S_*x*() subroutines
Sketches of class’s multiplication functions
Design summary
Design shortcomings
Code
Certification script
18. PROGRAMMING EXAMPLE: SPARSE MATRICES, CONTINUED
Introduction
Making overall timings
Timing T1, Mata R=RR
Timing T2, SpMat R=RR
Timing T3, SpMat R=SR
Timing T4, SpMat R=RS
Timing T5, SpMat R=SS
Call a function once before timing
Summary
Making detailed timings
Mata’s timer() function
Make a copy of the code to be timed
Make a do-file to run the example to be timed
Add calls to timer_on() and timer_off() to the code
Analyze timing results
Developing better algorithms
Developing a new idea
Aside
Features of associative arrays
Advanced use of pointers
Converting the new idea into code sketches
Converting the idea into a sketch of R_SxS()
Sketching subroutine cols_of_row()
Converting sketches into completed code
Double-bang comments and messages
// NotReached comments
Back to converting sketches
Measuring performance
Cleaning up
Finishing R_SxS() and cols_of_row()
Running certification
Continuing development
19. THE MATA REFERENCE MANUAL
A. Writing Mata code to add new commands to Stata
Overview
Ways to structure code
Accessing Stata’s data from Mata
Handling errors
Making the calculation and displaying results
Returning results
The Stata interface functions
Accessing Stata’s data
Modifying Stata’s data
Accessing and modifying Stata’s metadata
Changing Stata’s dataset
Accessing and modifying Stata macros, scalars, matrices
Executing Stata commands from Mata
Other Stata interface functions
B. Mata’s storage type for complex numbers
Complex values
Complex values and literals
Complex scalars, vectors, and matrices
Real, complex, and numeric eltypes
Functions Re(), Im(), and C()
Function eltype()
C. How Mata differs from C and C++
Introduction
Treatment of semicolons
Nested comments
Argument passing
Strings are not arrays of characters
Pointers
Pointers to existing objects
Pointers to new objects, allocation of memory
The size and even type of the object may change
Pointers to new objects, freeing of memory
Pointers to subscripted values
Pointer arithmetic is not allowed
Lack of switch/case statements
Mata code aborts with error when C would crash
D. Three-dimensional arrays (advanced use of pointers)
Introduction
Creating three-dimensional arrays
References
Author index
Subject index