TABLES OF DESCRIPTIVE STATISTICS


OVERVIEW

When researchers publish their work, it is typical to include a table of descriptive statistics, commonly known as a “Table 1”; this provides the reader with some information about the sample. For example, one may want to present some demographics, such as average age and average income. One might also compare these characteristics across groups, such as regions or fields of occupation.

 

In Stata 18, researchers can use dtable to create these and many other variations of a “Table 1” and export them to many formats. For instance, one can create a table and export it to Excel, HTML, Word, or PDF.
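As a minimal sketch of exporting one table to the formats listed above: the dataset (nhanes2l, loaded with webuse) and the variable list are assumptions chosen for illustration. The export() option writes the first file, and because dtable leaves its results in a collection, collect export can write the same table to the other formats, with the format inferred from the file extension.

```stata
* Illustrative sketch; the dataset and variables are assumptions
webuse nhanes2l, clear

* Create the table and export it to Excel in one step
dtable age bmi i.sex, export(table1.xlsx, replace)

* Export the same collection to HTML, Word, and PDF
collect export table1.html, replace
collect export table1.docx, replace
collect export table1.pdf, replace
```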

 

Also, because dtable is built on the collect suite of commands, which is designed for customizing any type of table, you can use the collect commands to further customize the look of your tables after you have created them using dtable.

 

DTABLE IN ACTION!

With dtable, creating a table of descriptive statistics can be as easy as specifying the variables you want in your table. dtable is designed so that you can create and export a table to various formats in one step. Means and standard deviations will be reported for continuous variables, and counts and percentages will be reported for factor variables. If this is all you need, you’ll simply specify the filename and file format to which you want to export, and you’ll be done. But of course, you can customize the table by including a range of statistics, performing different tests to check for equality between groups, adding a title and notes, and making other modifications.

 

In the examples below, we discuss how to create, modify, and export tables of descriptive statistics.

 

A SIMPLE EXAMPLE

To get started, we load data from the Second National Health and Nutrition Examination Survey (NHANES II) (McDowell et al. 1981). We want summary statistics for a set of variables, reported separately for each category of diabetes. Note that we use factor-variable notation to indicate that sex and race are categorical variables. We also export the table to the file table1.html:
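A sketch of what these commands might look like follows. It assumes the data are the nhanes2l example dataset and that the continuous variables are age, bmi, and bpsystol; the replace suboption is added only so the example can be rerun.

```stata
* Load the NHANES II example data (assumed to be the nhanes2l dataset)
webuse nhanes2l, clear

* Summary statistics by diabetes status; the i. prefix marks sex and
* race as factor variables. The variable list is an assumption.
dtable age bmi bpsystol i.sex i.race, ///
    by(diabetes)                      ///
    export(table1.html, replace)
```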

 

This is how easy it can be to create and export a table. If you prefer to export this table to another format, such as LaTeX or PDF, just specify the appropriate file extension. While this is an informative table, we will further customize it below.

 

By default, dtable reports means and standard deviations for continuous variables and counts and percentages for factor variables. We can see that roughly 95% of our sample is not diabetic while 5% is. We also see that the average systolic blood pressure for the nondiabetic group is 130 and the average for the diabetic group is 147. We can test whether mean systolic blood pressure differs across diabetic status. We can also test whether sex and diabetic status are independent. We could report tests comparing all variables across diabetic status groups. However, we’re not interested in comparing the age or racial composition for the two groups, so we suppress these tests for age and race. We also suppress the descriptive statistics for the overall sample, the column labeled “Total”:
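The following sketch shows one way to request these tests. It assumes the same nhanes2l data and variable list as the earlier example: the tests suboption of by() adds the comparisons, nototals drops the “Total” column, and test(none) suppresses the tests for age and race.

```stata
* Assumed dataset and variable list, for illustration only
webuse nhanes2l, clear

dtable age bmi bpsystol i.sex i.race, ///
    by(diabetes, tests nototals)      /// add tests, drop the Total column
    continuous(age, test(none))       /// no test for age
    factor(race, test(none))          //  no test for race
```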

 

© Copyright 1996–2024 StataCorp LLC. All rights reserved.

Next we specify the suboption place(seplabels) to place the frequency for each subsample in the column label but on a separate row. By default, frequencies and percentages are both reported for each group, but we want only the frequencies. Furthermore, we can modify how our statistics are displayed by using the sformat() option; here we enclose the counts in parentheses. We’ll also add a note with the total sample size and hide the label for the by() variable from the column header.
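These changes might be written as in the sketch below. The dataset and variable list are assumptions carried over from the earlier examples, and the note text is hypothetical. The sample size is computed rather than typed in, on the assumption that the table uses every observation with a nonmissing value of diabetes.

```stata
* Assumed dataset and variable list, for illustration only
webuse nhanes2l, clear

* Sample size for the note (assumes the table sample is all
* observations with nonmissing diabetes)
quietly count if !missing(diabetes)
local N = r(N)

dtable age bmi bpsystol i.sex i.race,           ///
    by(diabetes, tests nototals)                ///
    continuous(age, test(none))                 ///
    factor(race, test(none))                    ///
    sample(, statistics(freq) place(seplabels)) /// frequencies only, on their own row
    sformat("(N=%s)" frequency)                 /// enclose the counts in parentheses
    note(Total sample: N = `N')                 /// hypothetical note text
    column(by(hide))                            //  hide the by() variable label
```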

 

Finally, we format the means and standard deviations to two decimal places, add a title, and export our final table to an HTML file:
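Putting the pieces together, the final command might look like the sketch below. The dataset, variable list, note text, and title are assumptions for illustration. nformat() applies a numeric format to the named statistics, and export() writes the finished table.

```stata
* Assumed dataset and variable list, for illustration only
webuse nhanes2l, clear

quietly count if !missing(diabetes)
local N = r(N)

dtable age bmi bpsystol i.sex i.race,           ///
    by(diabetes, tests nototals)                ///
    continuous(age, test(none))                 ///
    factor(race, test(none))                    ///
    sample(, statistics(freq) place(seplabels)) ///
    sformat("(N=%s)" frequency)                 ///
    note(Total sample: N = `N')                 ///
    column(by(hide))                            ///
    nformat(%7.2f mean sd)                      /// two decimal places
    title(Table 1. Demographics)                /// hypothetical title
    export(table1.html, replace)
```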

 

Defining your own statistics

dtable offers a wide range of statistics that you can include in your table, such as the coefficient of variation, geometric mean, skewness, and many more. However, you may want to combine statistics in one cell. For example, you might place the interquartile range right beside the median. With dtable, you can create composite results from any of the supported statistics.

 

Below, we want to compare some health-related measurements between individuals who have had a heart attack and those who haven’t. We want means and standard deviations in one cell, separated by the plus-minus sign. Additionally, we want the interquartile range placed right next to the median. We define each of these composite results and specify the delimiter. Then we specify the statistics we want for each variable. Also, we add the svy option so that the statistics will be computed using survey weights.
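A sketch of such a command follows. The composite names meansd and myiqr are hypothetical, and the choice of variables (and which statistic each receives) is an assumption. The svy option relies on the survey design already being declared with svyset, which is assumed to be the case for nhanes2l.

```stata
* Assumed dataset, already svyset
webuse nhanes2l, clear

dtable, by(heartatk, tests)                                     ///
    define(meansd = mean sd, delimiter(" ± "))                  /// mean ± sd in one cell
    define(myiqr = p25 p75, delimiter("-"))                     /// 25th-75th percentile
    continuous(age bmi, statistics(meansd))                     ///
    continuous(tgresult tcresult hdresult, statistics(p50 myiqr)) ///
    svy
```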

 

Next we’ll format the percentiles and the medians to zero decimal places, and we’ll place the interquartile range in parentheses. By default, standard deviations are placed in parentheses, but we remove the parentheses below:
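The formatting step might be sketched as follows, reusing the hypothetical composite names and the assumed variable list from the previous example. nformat() sets the numeric format for the percentiles, the first sformat() wraps the interquartile range in parentheses, and the second replaces the default parenthesized format for standard deviations with a plain one.

```stata
* Assumed dataset, already svyset
webuse nhanes2l, clear

dtable, by(heartatk, tests)                                     ///
    define(meansd = mean sd, delimiter(" ± "))                  ///
    define(myiqr = p25 p75, delimiter("-"))                     ///
    continuous(age bmi, statistics(meansd))                     ///
    continuous(tgresult tcresult hdresult, statistics(p50 myiqr)) ///
    svy                                                         ///
    nformat(%6.0f p25 p50 p75)                                  /// zero decimal places
    sformat("(%s)" myiqr)                                       /// IQR in parentheses
    sformat("%s" sd)                                            //  no parentheses around sd
```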

 

Finally, we remove the label for the by() variable, change the label for the “Total” column, and change the label for the p-values:
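As a sketch, the column labels can be adjusted with the column() option. The replacement labels “All” and “p-value” are hypothetical, and the rest of the command repeats the assumptions of the previous examples.

```stata
* Assumed dataset, already svyset
webuse nhanes2l, clear

dtable, by(heartatk, tests)                                     ///
    define(meansd = mean sd, delimiter(" ± "))                  ///
    define(myiqr = p25 p75, delimiter("-"))                     ///
    continuous(age bmi, statistics(meansd))                     ///
    continuous(tgresult tcresult hdresult, statistics(p50 myiqr)) ///
    svy                                                         ///
    nformat(%6.0f p25 p50 p75)                                  ///
    sformat("(%s)" myiqr)                                       ///
    sformat("%s" sd)                                            ///
    column(by(hide) total(All) test(p-value))                   //  hypothetical labels
```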

 

FURTHER CUSTOMIZATION USING COLLECT

While dtable is designed to create and export a table in one step, you can customize these tables further with the collect suite of commands. Below, we change the color of the borders and the background color for alternating rows in our table:

 

. collect style cell border_block[column-header corner], ///
>     border(top, color(cyan))
. * Change the border color above the corner and column headers to cyan

. collect style cell border_block[row-header item], ///
>     border(bottom, color(cyan)) ///
>     border(top, color(cyan))
. * Change the border color above and below the results and row headers to cyan

. collect style cell cell_type[column-header], font(, bold)
. * Make the column headers bold

. collect style cell var[_N bmi tcresult], shading(background(cyan))
. /*
Change the background color to cyan for the rows corresponding to N,
BMI, and cholesterol
*/

 

Finally, we specify that the width of the columns be resized to fit the table contents and export the table:

 

. collect style putdocx, layout(autofitcontents)

. collect export table2.docx, replace
(collection DTable exported to file table2.docx)

Here is our resulting document: