What is a Confounder?
A confounder (also called a confounding variable) is one of those statistical terms that confuses a lot of people. Not because it represents a confusing concept, but because of how it’s used.
(Well, it’s a bit of a confusing concept, but that’s not the worst part).
It has slightly different meanings to different types of researchers. The definition is essentially the same, but the research context can have specific implications for how that definition plays out.
If the person you’re talking to has a different understanding of what it means, you’re going to have a confusing conversation.
Let’s take a look at some examples to unpack this.
(more…)
One of the key concepts in Survival Analysis is the Hazard Function.
But like a lot of concepts in Survival Analysis, the concept of “hazard” is similar to, but not exactly the same as, its meaning in everyday English. Since it’s so important, though, let’s take a look. (more…)
What are the best methods for checking a generalized linear mixed model (GLMM) for proper fit?
This question comes up frequently.
Unfortunately, it isn’t as straightforward as it is for a general linear model.
In linear models the requirements are easy to outline: linear in the parameters, normally distributed and independent residuals, and homogeneity of variance (that is, similar variance at all values of all predictors).
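To make those requirements concrete, here is a minimal sketch of how they might be checked for a one-predictor linear model. Everything in it is an assumption for illustration: the data are simulated, the variable names are made up, and the two checks (a variance ratio across low and high predictor values, and the share of residuals within two standard deviations) are rough stand-ins for the residual plots you would normally look at.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulated data are reproducible

# Simulated data (an assumption for illustration): y depends linearly on x,
# with independent normal errors of constant variance.
n = 500
x = [random.uniform(0, 10) for _ in range(n)]
y = [2.0 + 1.5 * xi + random.gauss(0, 1.0) for xi in x]

# Ordinary least squares for a single predictor, computed by hand.
mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Homogeneity of variance: compare residual spread at low vs. high x.
median_x = statistics.median(x)
low = [r for xi, r in zip(x, residuals) if xi <= median_x]
high = [r for xi, r in zip(x, residuals) if xi > median_x]
variance_ratio = statistics.variance(high) / statistics.variance(low)

# Rough normality check: roughly 95% of residuals should fall within 2 SDs.
sd = statistics.stdev(residuals)
share_within_2sd = sum(abs(r) <= 2 * sd for r in residuals) / n

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"variance ratio (high x / low x) = {variance_ratio:.2f}")
print(f"share of residuals within 2 SD = {share_within_2sd:.2f}")
```

A variance ratio near 1 and a within-2-SD share near 0.95 are what you would expect when the assumptions hold, as they do by construction here. In a real analysis, residual plots are usually more informative than any single number.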
(more…)
Survey questions are often structured without regard for ease of use within a statistical model.
Take, for example, a survey done by the Centers for Disease Control and Prevention (CDC) regarding births in the U.S. One of the variables in the data set is “interval since last pregnancy”. Here is a histogram of the results.
(more…)
A great tool to have in your statistical tool belt is logistic regression.
It comes in many varieties, and many of us are familiar with the variety for binary outcomes.
But multinomial and ordinal varieties of logistic regression are also incredibly useful and worth knowing.
They can be tricky to decide between in practice, however. In some — but not all — situations you (more…)
Multicollinearity can affect any regression model with more than one predictor. It occurs when two or more predictor variables overlap so much in what they measure that their effects are indistinguishable.
When the model tries to estimate their unique effects, it goes wonky (yes, that’s a technical term).
So for example, you may be interested in understanding the separate effects of altitude and temperature on the growth of a certain species of mountain tree.
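To see why altitude and temperature are hard to separate, here is a small simulated sketch. The numbers are assumptions chosen for illustration, not real data: temperature is generated to drop about 6.5 °C per 1,000 m of altitude, plus a little noise. With only two predictors, the variance inflation factor (VIF) for each one is 1 / (1 - r²), where r is their correlation.

```python
import math
import random

random.seed(1)  # fixed seed so the simulated data are reproducible

# Simulated sites (hypothetical values): altitude in metres, temperature in °C.
n = 200
altitude = [random.uniform(1000, 3000) for _ in range(n)]
# Assumed relationship: about 6.5 °C cooler per 1,000 m, plus small noise.
temperature = [25 - 0.0065 * a + random.gauss(0, 0.5) for a in altitude]


def pearson_r(u, v):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


r = pearson_r(altitude, temperature)

# With exactly two predictors, each predictor's VIF is 1 / (1 - r^2).
vif = 1 / (1 - r ** 2)

print(f"correlation between altitude and temperature = {r:.3f}")
print(f"VIF for each predictor = {vif:.1f}")
```

In this simulation the two predictors carry almost the same information, so the VIF comes out far above the rule-of-thumb cutoffs people often use. A model containing both would have very imprecise estimates of their separate effects on tree growth.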
(more…)