A Note From Karen
I’d like to welcome all the new members of our Data Analysis Brown Bag program. We had a big influx during our open time last month and we look forward to working with you. The program is now closed, but if you missed out, we’ll be opening again in September.
We’d like to say congratulations to Maike Rahn, who has been doing some freelance statistical projects for us. Maike was recently offered a full time job that was just too ideal to pass up, and we wish her the best of luck in her new position.
We’ve already started talking with a few other freelancers to serve those of you who need statistical projects done for you. And we’ll fill you in as we know more.
Today’s article was inspired by a question I received via email. I can’t generally answer email questions directly, but sometimes do on The Analysis Factor site. This one brought back a great memory of figuring out a statistical mystery. I hope you enjoy.
Happy analyzing!
Karen
Feature Article: Opposite Results in Ordinal Logistic Regression—Solving a Statistical Mystery
A number of years ago when I was still working in the consulting office at Cornell, someone came in asking for help interpreting their ordinal logistic regression results.
The client was surprised because all the coefficients were backwards from what they expected, and they wanted to make sure they were interpreting them correctly.
It looked like the researcher had done everything correctly, but the results were definitely bizarre. They were using SPSS and the manual wasn’t clarifying anything for me, so I did the logical thing: I ran it in another software program. I wanted to make sure the problem was with interpretation, and not in some strange default or mistake in running the program.
I used Minitab. Lo and behold, most of the parameter estimates were identical, but with signs opposite those from SPSS.
Now, if all of them had reversed sign, it would have been a simple matter of different defaults.
But the intercept was the same in both packages. Only the rest of the parameter estimates were backwards.
Curiouser and curiouser.
Luckily, in my office, we had access to at least four different packages, and we found (by now, the whole office was in on solving the mystery) that SPSS and Stata gave us one result and Minitab and SAS gave the other.
Here’s an example of ordinal logistic regression from SPSS and SAS output. For the record, SPSS uses “Threshold” for Intercept:
You can see that, indeed, all the coefficients (Estimate column) are identical but with opposite signs. The only exceptions are the intercepts, which have the same sign in both packages.
Mystery Solved
As it turns out, SPSS and Stata did something clever.
SAS and Minitab parameterize the model in the usual way, the same way any regression model does:
ln(Fij / (1 − Fij)) = alpha_j + beta × Xi
It makes interpretation difficult though, because those Fijs represent cumulative probabilities.
Fi1 is the probability that Y = 1, the lowest ordered category.
Fi2 is the probability that Y ≤ 2, the lowest two ordered categories.
Fi3 is the probability that Y ≤ 3, the lowest three ordered categories, and so on.
Each odds ratio (exp(beta)) represents the factor increase in the odds of moving into a lower ordered category for each one-unit increase in X.
In other words, as X gets bigger, a positive beta means higher odds of a lower ordered category.
That is intuitively backward.
It’s not wrong. It just takes a little bit more thought to keep track of all the directions of all these variables.
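To see that direction in numbers, here is a minimal Python sketch of the SAS/Minitab-style parameterization described above. The intercepts and the beta are made-up values chosen only for illustration, not estimates from any real output, and the function names are my own.

```python
import math


def inv_logit(z):
    """Convert a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_probs_plus(intercepts, beta, x):
    """P(Y <= j) for each cutpoint j when logit(Fij) = alpha_j + beta * x."""
    return [inv_logit(a + beta * x) for a in intercepts]


# Hypothetical values: three intercepts (four ordered categories), positive beta.
intercepts = [-1.0, 0.5, 2.0]
beta = 0.8

low = cumulative_probs_plus(intercepts, beta, x=0.0)
high = cumulative_probs_plus(intercepts, beta, x=1.0)

for j, (p0, p1) in enumerate(zip(low, high), start=1):
    print(f"P(Y <= {j}): x=0 -> {p0:.3f}, x=1 -> {p1:.3f}")
```

With a positive beta, every cumulative probability goes up when X goes up, so the lower categories become more likely. The ratio of the cumulative odds at x=1 to those at x=0 is exp(beta) at every cutpoint.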
To make your life just a little easier, SPSS and Stata run the model this way instead:
ln(Fij / (1 − Fij)) = alpha_j − beta × Xi
See that very important minus sign? It flips the sign of every estimated coefficient other than the intercepts.
In this version of the model, positive values of beta indicate higher odds of moving to the next higher ordered category for higher values of X.
Which makes a little more sense.
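A quick way to convince yourself that the two parameterizations describe the same fitted model is to plug in numbers. The sketch below uses hypothetical intercepts and a hypothetical beta (none of these come from real output). It checks that the plus-sign version with beta and the minus-sign version with the negated beta give identical cumulative probabilities, with the intercepts left untouched.

```python
import math


def inv_logit(z):
    """Convert a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def cum_prob_plus(intercept, beta, x):
    """P(Y <= j) when logit(Fij) = alpha_j + beta * x (SAS/Minitab style)."""
    return inv_logit(intercept + beta * x)


def cum_prob_minus(intercept, beta, x):
    """P(Y <= j) when logit(Fij) = alpha_j - beta * x (SPSS/Stata style)."""
    return inv_logit(intercept - beta * x)


# Hypothetical estimates: same intercepts in both conventions, beta with flipped sign.
intercepts = [-1.0, 0.5, 2.0]
beta_plus = 0.8          # as a plus-convention package would report it
beta_minus = -beta_plus  # as a minus-convention package would report it

pairs = []
for x in (0.0, 1.0, 2.5):
    for a in intercepts:
        p_plus = cum_prob_plus(a, beta_plus, x)
        p_minus = cum_prob_minus(a, beta_minus, x)
        pairs.append((p_plus, p_minus))
        print(f"x={x}, intercept={a}: {p_plus:.4f} vs {p_minus:.4f}")
```

The two columns match row for row, which is exactly the pattern we saw across the packages: same intercepts, opposite-signed slopes, same model.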
The lessons:
- Know your software defaults. It was a strange output that led us to look up this one, but not all incorrect results look strange.
- Knowing which sign convention your software uses is also crucial if you use these results to calculate predicted probabilities.
- Knowing at least two software packages is a huge advantage. It helps you figure out if strange results are a software issue or a statistical one.
- Wow, my job is so fun. (Really).
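The point about predicted probabilities is worth a small illustration. This is only a sketch under stated assumptions: the intercepts and coefficient are invented, `category_probs` is a hypothetical helper of my own, and `sign` simply selects which of the two conventions discussed above to apply. It shows what happens if you take a coefficient printed by a minus-convention package and plug it into the plus-convention formula by mistake.

```python
import math


def inv_logit(z):
    """Convert a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(intercepts, beta, x, sign):
    """Predicted probability of each ordered category.

    sign=+1 uses logit(Fij) = alpha_j + beta * x (SAS/Minitab style);
    sign=-1 uses logit(Fij) = alpha_j - beta * x (SPSS/Stata style).
    The intercepts must be in increasing order.
    """
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    # Cumulative probabilities, with 1.0 appended for the highest category.
    cumulative = [inv_logit(a + sign * beta * x) for a in intercepts] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


# Hypothetical output from a minus-convention package.
intercepts = [-1.0, 0.5, 2.0]
beta_reported = 0.8

right = category_probs(intercepts, beta_reported, x=1.0, sign=-1)
wrong = category_probs(intercepts, beta_reported, x=1.0, sign=+1)

print("correct convention:", [round(p, 3) for p in right])
print("wrong convention:  ", [round(p, 3) for p in wrong])
```

Both sets of probabilities sum to 1 and look perfectly plausible, but they point in opposite directions: the wrong convention shifts probability toward the lowest category instead of the highest. Nothing about the numbers themselves would warn you.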
Further Reading and Resources
Explaining Logistic Regression Results to Non-Statistical Audiences
Logistic Regression Models: Reversed odds ratios in SAS Proc Logistic–Use ‘Descending’
Binary, Ordinal, and Multinomial Logistic Regression for Categorical Outcomes