guest contributor

Getting Started with Stata Tutorial #7: Importing Data into Stata

December 10th, 2024 by

In our previous posts, we’ve relied on Stata’s pre-loaded datasets to perform analyses. But when you’re working with your own data, you’ll need to know how to import it into Stata.

To demonstrate how this process works, we will use the Iris dataset from UCI.

Download the dataset, then move it to whichever directory you intend to use for Stata files.

There are three main ways to import data in Stata: use the menus, refer to the dataset by its full file path, or change your working directory to the one containing your data and then refer to the dataset by name. (more…)
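As a rough sketch of the two command-based approaches, the lines below import the Iris file by full path and then by name after changing directory. The folder path is hypothetical, and the sketch assumes the UCI download is a comma-separated file named iris.data with no header row; the full post may use different commands.

```stata
* Approach 1: refer to the file by its full path (hypothetical folder)
import delimited "C:/Users/yourname/Documents/stata/iris.data", varnames(nonames) clear

* Approach 2: change the working directory, then refer to the file by name
cd "C:/Users/yourname/Documents/stata"
import delimited "iris.data", varnames(nonames) clear

* Check what was read in; without a header row the variables are named v1, v2, ...
describe
```

The menu route (File > Import) builds an equivalent import command for you and echoes it in the Results window.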


When to Report Separate Group or a Pooled Mean

November 12th, 2024 by

Have you ever wondered whether you should report separate means for different groups or a pooled mean from the entire sample? This question comes up often, for instance when deciding whether to split results by sex, region, or observed treatment.

(more…)


Getting Started with Stata Tutorial #6: How Stata Code Works

July 18th, 2024 by

If you’ve tried coding in Stata, you may have found it strange. The syntax rules are straightforward, but different from what I’d expect.

I had experience coding in Java and R before I ever used Stata. Because of this, I expected commands to be followed by parentheses, and for this to make it easy to read the code’s structure.

Stata does not work this way.

An Example of how Stata Code Works

To see the way Stata handles a linear regression, go to the command line and type

h reg or help regress

You will see a help page pop up, with this Syntax line near the top.
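If you don't have the help window open, the syntax diagram looks roughly like the commented line below. It is reproduced from memory of Stata's help file, so treat the exact layout as an assumption. The call that follows is only an illustration of the pattern, using the auto dataset that ships with Stata.

```stata
* Syntax diagram shown by -help regress- (square brackets mark optional parts):
*   regress depvar [indepvars] [if] [in] [weight] [, options]

* An illustrative call following that pattern
sysuse auto, clear
regress price mpg weight if foreign == 0, robust
```

Here price is the dependent variable, mpg and weight are the independent variables, the if qualifier restricts the sample, and everything after the comma is an option.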

(If you need a refresher on getting help in Stata, watch this video by Jeff Meyer.)

This is typical of how Stata code looks. (more…)


Member Training: Introduction to Structural Equation Modeling

June 1st, 2024 by

Structural Equation Modeling (SEM) is a popular method to test hypothetical relationships between constructs in the social sciences. These constructs may be unobserved (a.k.a., “latent”) or observed (a.k.a., “manifest”).

In this training, you will learn the different types of SEM: confirmatory factor analysis, path analysis for manifest and latent variables, and latent growth modeling (i.e., the application of SEM on longitudinal data).

We’ll discuss the different terminology, the commonly used symbols, and the different ways a model can be specified, as well as how to present results and evaluate the fit of the models.

This training will be at a very basic conceptual level; however, it is assumed that participants have an understanding of multiple regression, interpretation of statistical tests, and methods of data screening.


Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.

(more…)


Getting Started with Stata Tutorial #5: The Stata Do-File

May 4th, 2024 by

From our first Getting Started with Stata posts, you should be comfortable navigating the windows and menus of Stata. We can now get into programming in Stata with a do-file.

Why Do-Files?

A do-file is a Stata file that provides a list of commands to run. You can run an entire do-file at once, or you can highlight and run particular lines from the file.

If you set up your do-file correctly, you can just click “run” after opening it. The do-file will change to the correct directory, open your dataset, run all analyses, and save any graphs or results you want saved.
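A minimal sketch of such a do-file might look like the following. The file name, folder path, and output file name are hypothetical, and the built-in auto dataset stands in for your own data so the sketch runs as written.

```stata
* analysis.do: a hypothetical do-file skeleton
clear all

* Set the working directory (hypothetical path; uncomment and edit for your machine)
* cd "C:/Users/yourname/Documents/stata"

* Open the dataset; with your own data this would be: use "mydata.dta", clear
sysuse auto, clear

* Run the analyses
summarize price mpg weight
regress price mpg weight

* Draw a graph and save it to the working directory
scatter price mpg
graph export "price_mpg.png", replace
```

Running the whole file repeats every step in the same order each time.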

I’ll start off by saying this: Any analysis you want to run in Stata can be run without a do-file, just using menus and individual commands in the command window. But you still should make a do-file for the following reason:

Reproducibility (more…)


Too Many Colors Spoil the Graph

March 26th, 2024 by

When you draw a graph, whether a bar chart, a scatter plot, or even a pie chart, you have a broad range of colors to choose from. R, for example, has 657 different colors from aliceblue to yellowgreen. SAS has 13 shades of orange, 33 shades of blue, and 47 shades of green. They even have different shades of black.

You have a wealth of colors, but you can’t use all of them in the same graph. The ideal number of colors is 2.

(more…)