An Introduction to Stata Programming by Christopher F. Baum
Comment from the Stata technical group
Christopher F. Baum's An Introduction to Stata Programming
is worthwhile for anyone wanting to learn about programming in Stata.
For the beginner, Baum assumes only that the user is familiar with Stata,
and so he builds up accordingly. For the more advanced Stata
programmer, the book introduces Stata's Mata
programming language and provides optimization tips for day-to-day
work. All readers will find better, new ways to approach old tasks.
Baum steps the reader through the three levels of Stata
programming. First up are do-files. Though often thought of as
simple batch files, do-files support both loops and conditional
execution, and hence can be used for automation as well as
reproducibility. While giving examples of do-file programming, Baum
introduces useful but often-overlooked Stata constructions.
Next come ado-files, which are used
to extend Stata by creating new commands that share the syntax and
behavior of official commands. Baum gives an example of how to write a
simple additional command for Stata, complete with documentation and
certification. After writing the simple command, users can then learn
how to write their own custom estimation commands by using both Stata's
built-in numerical maximum-likelihood estimation routine, ml, and its
built-in nonlinear least-squares routines, nl and
nlsur.
The book finishes with two chapters on programming in Mata, which is Stata's
matrix programming language. Mata programs are integrated into
ado-files to build a custom estimation routine that is optimized for
speed and numerical stability. While stepping through these structures,
Baum weaves in the details that are needed to become an expert at Stata
programming, so readers will also learn more about Stata itself while
learning the tools for programming.
Baum approaches each topic by first explaining the background and
need for the topic, then looking at the basic usage and examples,
and finally examining use within larger, more applied cookbook
examples. Many of his examples come from questions posed on the
Statalist listserver, so they address complexities of interest to a
broad range of Stata users. The programming examples cover an array
of topics, illustrate some of Stata's built-in tools (such as the
resampling techniques of bootstrapping and jackknifing), and offer solutions to tricky data management questions.
The breadth and depth of this book make it a necessity for anyone interested in programming in Stata.
Table of contents
List of tables
List of figures
Acknowledgments
Notation and typography
1 Why should you become a Stata programmer?
- Do-file programming
- Ado-file programming
- Mata programming for ado-files
- 1.1 Plan of the book
- 1.2 Installing the necessary software
2 Some elementary concepts and tools
- 2.1 Introduction
- 2.1.1 What you should learn from this chapter
- 2.2 Navigational and organizational issues
- 2.2.1 The current working directory and profile.do
- 2.2.2 Locating important directories: sysdir and adopath
- 2.2.3 Organization of do-files, ado-files, and data files
- 2.3 Editing Stata do- and ado-files
- 2.4 Data types
- 2.4.1 Storing data efficiently: The compress command
- 2.4.2 Date and time handling
- 2.4.3 Time-series operators
- 2.5 Handling errors: The capture command
- 2.6 Protecting the data in memory: The preserve and restore commands
- 2.7 Getting your data into Stata
- 2.7.1 Inputting data from ASCII text files and spreadsheets
- Handling text files
- Free format versus fixed format
- The insheet command
- Accessing data stored in spreadsheets
- Fixed-format data files
- 2.7.2 Importing data from other package formats
- 2.8 Guidelines for Stata do-file programming style
- 2.8.1 Basic guidelines for do-file writers
- 2.8.2 Enhancing speed and efficiency
- 2.9 How to seek help for Stata programming
3 Do-file programming: Functions, macros, scalars, and matrices
- 3.1 Introduction
- 3.1.1 What you should learn from this chapter
- 3.2 Some general programming details
- 3.2.1 The varlist
- 3.2.2 The numlist
- 3.2.3 The if exp and in range qualifiers
- 3.2.4 Missing data handling
- Recoding missing values: The mvdecode and mvencode commands
- 3.2.5 String-to-numeric conversion and vice versa
- Numeric-to-string conversion
- Working with quoted strings
- 3.3 Functions for the generate command
- 3.3.1 Using if exp with indicator variables
- 3.3.2 The cond() function
- 3.3.3 Recoding discrete and continuous variables
- 3.4 Functions for the egen command
- Official egen functions
- egen functions from the user community
- 3.5 Computation for by-groups
- 3.5.1 Observation numbering: _n and _N
- 3.6 Local macros
- 3.7 Global macros
- 3.8 Extended macro functions and macro list functions
- 3.8.1 System parameters, settings, and constants: creturn
- 3.9 Scalars
- 3.10 Matrices
4 Cookbook: Do-file programming I
4.1 Tabulating a logical condition across a set of variables
4.2 Computing summary statistics over groups
4.3 Computing the extreme values of a sequence
4.4 Computing the length of spells
4.5 Summarizing group characteristics over observations
4.6 Using global macros to set up your environment
4.7 List manipulation with extended macro functions
4.8 Using creturn values to document your work
5 Do-file programming: Validation, results, and data management
5.1 Introduction
5.1.1 What you should learn from this chapter
5.2 Data validation: The assert, count, and duplicates commands
5.3 Reusing computed results: The return and ereturn commands
- 5.3.1 The ereturn list command
5.4 Storing, saving, and using estimated results
- 5.4.1 Generating publication-quality tables from stored estimates
5.5 Reorganizing datasets with the reshape command
5.6 Combining datasets
5.7 Combining datasets with the append command
5.8 Combining datasets with the merge command
- 5.8.1 The dangers of many-to-many merges
5.9 Other data-management commands
- 5.9.1 The fillin command
- 5.9.2 The cross command
- 5.9.3 The stack command
- 5.9.4 The separate command
- 5.9.5 The joinby command
- 5.9.6 The xpose command
6 Cookbook: Do-file programming II
6.1 Efficiently defining group characteristics and subsets
- 6.1.1 Using a complicated criterion to select a subset of observations
6.2 Applying reshape repeatedly
6.3 Handling time-series data effectively
6.4 reshape to perform rowwise computation
6.5 Adding computed statistics to presentation-quality tables
- 6.5.1 Presenting marginal effects rather than coefficients
6.6 Generating time-series data at a lower frequency
7 Do-file programming: Prefixes, loops, and lists
- 7.1 Introduction
- 7.1.1 What you should learn from this chapter
- 7.2 Prefix commands
- 7.2.1 The by prefix
- 7.2.2 The xi prefix
- 7.2.3 The statsby prefix
- 7.2.4 The rolling prefix
- 7.2.5 The simulate and permute prefixes
- 7.2.6 The bootstrap and jackknife prefixes
- 7.2.7 Other prefix commands
- 7.3 The forvalues and foreach commands
8 Cookbook: Do-file programming III
8.1 Handling parallel lists
8.2 Calculating moving-window summary statistics
- 8.2.1 Producing summary statistics with rolling and merge
- 8.2.2 Calculating moving-window correlations
8.3 Computing monthly statistics from daily data
8.4 Requiring at least n observations per panel unit
8.5 Counting the number of distinct values per individual
9 Do-file programming: Other topics
- 9.1 Introduction
- 9.1.1 What you should learn from this chapter
- 9.2 Storing results in Stata matrices
- 9.3 The post and postfile commands
- 9.4 Output: The outsheet, outfile, and file commands
- 9.5 Automating estimation output
- 9.6 Automating graphics
- 9.7 Characteristics
10 Cookbook: Do-file programming IV
- 10.1 Computing firm-level correlations with multiple indices
- 10.2 Computing marginal effects for graphical presentation
- 10.3 Automating the production of LaTeX tables
- 10.4 Tabulating downloads from the Statistical Software Components archive
- 10.5 Extracting data from graph files' sersets
- 10.6 Constructing continuous price and returns series
11 Ado-file programming
11.1 Introduction
- 11.1.1 What you should learn from this chapter
11.2 The structure of a Stata program
11.3 The program statement
11.4 The syntax and return statements
11.5 Implementing program options
11.6 Including a subset of observations
11.7 Generalizing the command to handle multiple variables
11.8 Making commands byable
- Program properties
11.9 Documenting your program
11.10 egen function programs
11.11 Writing an e-class program
- 11.11.1 Defining subprograms
11.12 Certifying your program
11.13 Programs for ml, nl, nlsur, simulate, bootstrap, and jackknife
- Writing an ml-based command
- 11.13.1 Programs for the nl and nlsur commands
- 11.13.2 Programs for the simulate, bootstrap, and jackknife prefixes
11.14 Guidelines for Stata ado-file programming style
- 11.14.1 Presentation
- 11.14.2 Helpful Stata features
- 11.14.3 Respect for datasets
- 11.14.4 Speed and efficiency
- 11.14.5 Reminders
- 11.14.6 Style in the large
- 11.14.7 Use the best tools
12 Cookbook: Ado-file programming
12.1 Retrieving results from rolling:
12.2 Generalization of egen function pct9010() to support all pairs of quantiles
12.3 Constructing a certification script
12.4 Using the ml command to estimate means and variances
- 12.4.1 Applying equality constraints in ml estimation
12.5 Applying inequality constraints in ml estimation
12.6 Generating a dataset containing the single longest spell
13 Mata functions for ado-file programming
13.1 Mata: First principles
- 13.1.1 What you should learn from this chapter
13.2 Mata fundamentals
- 13.2.1 Operators
- 13.2.2 Relational and logical operators
- 13.2.3 Subscripts
- 13.2.4 Populating matrix elements
- 13.2.5 Mata loop commands
- 13.2.6 Conditional statements
13.3 Function components
- 13.3.1 Arguments
- 13.3.2 Variables
- 13.3.3 Saved results
13.4 Calling Mata functions
13.5 Mata st_ interface functions
- 13.5.1 Data access
- 13.5.2 Access to locals, globals, scalars, and matrices
- 13.5.3 Access to Stata variables' attributes
13.6 Example: st_ interface function usage
13.7 Example: Matrix operations
- 13.7.1 Extending the command
13.8 Creating arrays of temporary objects with pointers
13.9 Structures
13.10 Additional Mata features
- 13.10.1 Macros in Mata functions
- 13.10.2 Compiling Mata functions
- 13.10.3 Building and maintaining an object library
- 13.10.4 A useful collection of Mata routines
14 Cookbook: Mata function programming
- 14.1 Reversing the rows or columns of a Stata matrix
- 14.2 Shuffling the elements of a string variable
- 14.3 Firm-level correlations with multiple indices with Mata
- 14.4 Passing a function to a Mata function
- 14.5 Using subviews in Mata
- 14.6 Storing and retrieving country-level data with Mata structures
- 14.7 Locating nearest neighbors with Mata
- 14.8 Computing the seemingly unrelated regression estimator
- 14.9 GMM-CUE estimator using Mata's optimize() functions
References
Author Index
Subject Index

