We finished the last article about Stata with this confusing-looking code:
local continuous educat exper wage age
graph box `var', saving(`var', replace)
I admit it looks like a foreign language. Let me explain how simple it is to understand.
Stata allows you to use a single word, such as “continuous”, to stand in for a longer piece of text, such as a list of variable names. In Stata this stand-in is known as a macro.
Please note that a macro in Stata is not the same as a macro in Microsoft Excel.
Writing macros in Excel can be long and involved.
Writing a macro in Stata is very easy.
How to write a simple macro in Stata
A macro definition in Stata begins with the word “global” or “local”, followed by the name of the macro and then its contents. The command global tells Stata to keep the macro in memory until you exit Stata. If you open another data set before exiting, the global macro will still be in memory.
The command local tells Stata to keep the macro in memory only until the program or do-file ends.
If you plan on analyzing only one data set then the global command shouldn’t cause you any problems.
If you will be analyzing more than one data set then you might want to consider using the local command.
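A minimal sketch of the difference may help. It assumes Stata's built-in auto data set rather than the wages data, and the macro names sizevars and costvars are made up for illustration. Run all of the lines together from a do-file.

```stata
* Assumption: the shipped auto data set stands in for your own data
sysuse auto, clear

* A global macro is referenced with a leading dollar sign
* and stays in memory until you exit Stata
global sizevars weight length
summarize $sizevars

* A local macro is referenced with a backtick before the name and an
* apostrophe after it, and it is gone once the do-file finishes running
local costvars price mpg
summarize `costvars'
```

If a global macro is no longer needed, macro drop sizevars removes it without having to exit Stata.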
If you remember from the previous article, the wages data set had four continuous variables.
In the first line of my code above, local continuous educat exper wage age, I am using the word “continuous” to represent the four variables educat, exper, wage, and age.
Some commands in Stata allow you to analyze more than one variable at a time. For example, I might want to run the following commands in order to see what my continuous variables look like.
tab1 educat exper wage age
codebook educat exper wage age
summarize educat exper wage age
But why bother with a macro?
Using a macro allows me to simplify my work, which will reduce the potential for errors and keep it organized.
The code below will perform the exact same actions as the code above.
local continuous educat exper wage age
tab1 `continuous'
codebook `continuous'
summarize `continuous'
The command local tells Stata to use the word continuous to represent the variables educat, exper, wage, and age. I then substitute the word continuous, wrapped in its special quotes, for the variable names in the last three lines.
Note that each time you use the word continuous in a command line you must begin with the left single quote ` (the backtick, on the key to the left of the 1 key on a US keyboard) and finish with the ordinary apostrophe ' (on the key to the right of the ; key). Curly “smart” quotes pasted from a word processor will not work.
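One quick way to check that the quotes are typed correctly is to ask Stata to print what the macro contains. This is only a sketch. It uses the display command, and since a macro is just stored text, no data set has to be loaded for it to run.

```stata
* Define the macro exactly as in the example above
local continuous educat exper wage age

* Opening character is the backtick, closing character is the apostrophe;
* Stata swaps in the stored text before running the command
display "`continuous'"
```

Run together, these lines print educat exper wage age. If the macro is not defined at the moment a line runs (for example, because only part of a do-file was highlighted and run), `continuous' is replaced by nothing at all, and a command such as summarize then reports on every variable in the data set.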
Using a macro to represent a handful of variables may not seem like a big deal. But wait, there is more.
For example, you might have to run numerous linear regressions, using several of the same predictor variables in each regression you run. After reviewing your results you might decide to eliminate one of the variables.
If you didn’t use a macro, you would have to go back to every regression command and remove the variable. If you did use a macro, you only have to go back to the line of code that defines the macro and remove the variable from the list.
Remove it once and it is gone. Que facil!
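As a rough illustration of the regression case, here is a sketch that again assumes the built-in auto data set. The macro name predictors and the choice of variables are hypothetical, not taken from the wages example.

```stata
* Assumption: auto data set and these three predictors are for illustration only
sysuse auto, clear

* Define the shared predictors once
local predictors weight length foreign

* Every model reuses the same list
regress price `predictors'
regress mpg `predictors'
```

To drop length from both models, delete it from the local line and rerun the do-file. Neither regress line needs to change.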
Macros for formatting tables and graphs
Another great use for macros is for creating tables and graphs. Unless you are Raymond (of Rain Man fame), there is no way you will remember your favorite options when creating a graph.
What are options for a graph? To name just a few: the major and minor tick labels for the X and Y axes, the scale properties of each axis, whether to have a legend, the placement of the legend, the placement of the title, and so on.
This is an example of the coding for a graph that I once created:
ms(O) mc(gs0) msize(small)) (line populatn_hat d, sort lcolor(gs0)) (line populatn_hat_p d, sort lcolor(gs0) lpattern(dash)) (line populatn_hat_m d, sort lcolor(gs0) lpattern(dash)), xline(0, lcolor(gs0)) title("Quadratic fit", color(gs0)) ytitle("Population in district") legend(label(1 "Vote shares") label(2 "Quadratic fit"))
It would sure be easier to use a one word macro to represent all of the above.
I suggest you create a do-file template for tables and graphs (see my previous article on creating do-file templates).
In the template you create various macros that contain the options for your tables and graphs.
Be sure to document (by using // or “*” as discussed in the previous article) in your do-file what the table or graph will look like.
Perhaps the best reason for using macros for your table and graph formatting is that it ensures consistent formatting.
If you decide you want to tweak how the tables in your research paper look, you will only need to make a change to the macro. This saves you from having to go back to every table and change the coding.
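Here is one way such a formatting macro might look. Treat it as a sketch: the macro names markerstyle and layout, the auto data set, and the particular options are all assumptions chosen for illustration, and the /// line continuation only works inside a do-file.

```stata
* Assumption: auto data set stands in for your own data
sysuse auto, clear

* Marker options for the scatter plot, stored once
local markerstyle msymbol(O) mcolor(gs0) msize(small)

* Title and legend options shared by every graph in the paper
local layout title("Quadratic fit", color(gs0)) legend(label(1 "Observed") label(2 "Quadratic fit"))

* The graph command itself stays short and readable
twoway (scatter mpg weight, `markerstyle') ///
       (qfit mpg weight, lcolor(gs0)), `layout'
```

Changing the title color or the marker size for every graph then means editing one local line rather than every twoway command.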
In my next article I will show you another way to save time and effort in Stata through looping.
Meghan Shirley Bezerra says
Hi Jeff. Thanks for this helpful post.
I’m working with a dataset with >100 continuous variables. I tried out just assigning 3 of these variables to a macro as you illustrated, so:
local continuous age fevpp fasting_glucose
But then when I tried:
codebook `continuous', compact
summarize `continuous'
it gave me all continuous variables in the dataset, not just the 3 specified in the macro.
Any idea what I might have done wrong?
Thanks!
Jeff Meyer says
There are 2 possible causes that I can think of. Do you have a global macro named continuous as well that contains all of the continuous variables in your data set? This shouldn’t be the cause because a global macro would require a “$” in front of continuous. The only other reason might be that you have to highlight all of the lines of code from the line local continuous age fevpp fasting_glucose to the line summarize `continuous' and run them together, so that the local macro is still defined when the later commands use it.
Caitlin Klaassen says
Hey all, I am trying to update a global directory for a replication project, but I am unsure how to do this. Could someone help?
Miranda says
For macros and loops, is it possible to define large blocks of variables that are not contiguous in the dataset? As a simple example, let’s say I have a dataset that examines the look, feel, and taste of apples, bananas, and oranges with a varlist as follows:
v1= apple look q1
v2= apple look q2
v3= apple look q3
v4= apple feel q1
v5= apple feel q2
v6= apple feel q3
v7= apple taste q1
v8= apple taste q2
v9= apple taste q3
v10-18 = bananas
v19-27 = oranges
Let’s say I’m only interested in taste, and I want to create a macro or a loop to root out miscoded cases among those variables. I want to create a single macro that includes the following variable groups: v7-v9, v16-v18, and v25-v27. Is there a straightforward way to do this without having to type out each variable?
Jeff Meyer says
You can use an if statement within your loop.
Mekdes says
I found it helpful. Thank you so much.
irina says
thank you for such a simple explanation with great examples! Saved in my personal notes. Thank you sooooo much!
Moses Otieno says
This is awesome! Have learnt thoroughly on macros!!!!