Centering a covariate (a continuous predictor variable) can make regression coefficients much more interpretable. That’s a big advantage, particularly when you have many coefficients to interpret, or when you’ve included terms that are tricky to interpret, like interactions or quadratic terms.
For example, say you had one categorical predictor with 4 categories and one continuous covariate, plus an interaction between them.
First, you’ll notice that if you center your covariate at the mean, there is (more…)
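To make the idea of centering concrete, here is a minimal sketch. It assumes a much simpler model than the one described above (a single covariate, with no categorical predictor or interaction), and the `age` and `score` values are made up for illustration. The `fit_line` helper is written just for this example. It shows that centering at the mean leaves the slope unchanged and turns the intercept into the predicted outcome at the average covariate value.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for the model y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: age is the covariate, score is the outcome.
age = [25, 32, 41, 47, 53, 60]
score = [52.0, 55.5, 61.0, 63.5, 68.0, 70.5]

# Center the covariate at its mean.
age_centered = [a - mean(age) for a in age]

raw_intercept, raw_slope = fit_line(age, score)
ctr_intercept, ctr_slope = fit_line(age_centered, score)

# Uncentered: the intercept is the predicted score at age 0 (not a meaningful value here).
print("uncentered:", raw_intercept, raw_slope)
# Centered: the intercept is the predicted score at the mean age; the slope is the same.
print("centered:  ", ctr_intercept, ctr_slope)
```

In this one-covariate case the centered intercept equals the mean of the outcome, which is usually a far easier number to talk about than a prediction at age zero.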
The following statement might surprise you, but it’s true.
To run a linear model, you don’t need an outcome variable Y that’s normally distributed. Instead, you need a dependent variable that is:
- Continuous
- Unbounded
- Measured on an interval or ratio scale
The normality assumption is about the errors in the model, which have the same distribution as Y|X, just shifted to have a mean of zero. It’s absolutely possible to have a skewed distribution of Y and a normal distribution of errors because of the effect of X. (more…)
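A small simulation can illustrate the point. This is only a sketch under stated assumptions: the predictor is drawn from an exponential distribution (so it is strongly skewed), the errors are drawn from a normal distribution, and the coefficients 2 and 3 are arbitrary. The `fit_line` and `skewness` helpers are written for this example, not taken from any library.

```python
import random
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for the model y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def skewness(values):
    """Moment-based sample skewness: roughly 0 for a symmetric distribution."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

random.seed(0)  # fixed seed so the run is repeatable
n = 5000

# Assumed setup: a right-skewed predictor and normally distributed errors.
x = [random.expovariate(1.0) for _ in range(n)]
errors = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + ei for xi, ei in zip(x, errors)]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

y_skew = skewness(y)
resid_skew = skewness(residuals)
print("skewness of Y:        ", y_skew)      # clearly positive: Y is right-skewed
print("skewness of residuals:", resid_skew)  # close to zero: errors look symmetric
```

The outcome inherits its skew from X, while the residuals, which are what the assumption is about, stay roughly symmetric.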
What is a Confounder?
Confounder (also called confounding variable) is one of those statistical terms that confuses a lot of people. Not because it represents a confusing concept, but because of how it’s used.
(Well, it’s a bit of a confusing concept, but that’s not the worst part.)
It has slightly different meanings to different types of researchers. The definition is essentially the same, but the research context can have specific implications for how that definition plays out.
If the person you’re talking to has a different understanding of what it means, you’re going to have a confusing conversation.
Let’s take a look at some examples to unpack this.
(more…)
Last week I had the pleasure of teaching a webinar on Interpreting Regression Coefficients. We walked through the output of a somewhat tricky regression model—it included two dummy-coded categorical variables, a covariate, and a few interactions.
As always seems to happen, our audience asked an amazing number of great questions. (Seriously, I’ve had multiple guest instructors compliment me on our audience and their thoughtful questions.)
We had so many that although I spent about 40 minutes answering (more…)
Predictor variables in statistical models can be treated as either continuous or categorical.
Usually, this is a very straightforward decision.
Categorical predictors, like treatment group, marital status, or highest educational degree, should be specified as categorical.
Likewise, continuous predictors, like age, systolic blood pressure, or percentage of ground cover, should be specified as continuous.
But there are numerical predictors that aren’t continuous. Sometimes it makes sense to treat these as continuous, and sometimes it makes sense to treat them as categorical.
(more…)
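As a rough sketch of what that choice changes, the example below takes a hypothetical count predictor (`n_children`, with made-up outcome values) and fits it both ways. Treated as continuous, the predictor gets a single slope. Treated as categorical, each level gets its own fitted mean, which is what a dummy-coded model with only this predictor would estimate. The `fit_line` helper is written for this example only.

```python
from collections import defaultdict
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for the model y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: a numerical predictor that only takes a few whole-number values.
n_children = [0, 0, 1, 1, 2, 2, 3, 3]
outcome = [10.0, 12.0, 15.0, 17.0, 16.0, 18.0, 16.5, 17.5]

# Treated as continuous: one slope, so every extra child shifts the prediction equally.
intercept, slope = fit_line(n_children, outcome)
linear_pred = {k: intercept + slope * k for k in sorted(set(n_children))}

# Treated as categorical: each level gets its own mean, with no assumed pattern.
by_level = defaultdict(list)
for k, value in zip(n_children, outcome):
    by_level[k].append(value)
group_means = {k: mean(values) for k, values in sorted(by_level.items())}

for k in group_means:
    print(k, "continuous:", round(linear_pred[k], 2), "categorical:", group_means[k])
```

In this made-up data the outcome jumps between zero and one child and then levels off, so the straight-line fit misses the pattern that the categorical fit picks up, at the cost of estimating more parameters.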