In a recent article, we reviewed the impact of removing the intercept from a regression model when the predictor variable is categorical. This month we’re going to talk about removing the intercept when the predictor variable is continuous.
Spoiler alert: You should never remove the intercept when a predictor variable is continuous.
Here’s why. (more…)
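To make the concern concrete, here is a minimal sketch in plain Python. The data and function names are made up for illustration and are not taken from the article. It fits the same five points twice, once with an intercept and once forced through the origin.

```python
def fit_with_intercept(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_through_origin(x, y):
    """Least squares for y = b1 * x (no intercept); returns b1."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical data generated exactly from y = 10 + 2x.
x = [1, 2, 3, 4, 5]
y = [10 + 2 * v for v in x]

intercept, slope = fit_with_intercept(x, y)
slope_no_intercept = fit_through_origin(x, y)

print("with intercept:    intercept =", intercept, " slope =", slope)
print("without intercept: slope =", slope_no_intercept)
```

In this toy data the true slope is 2. Once the line is forced through the origin, the slope has to absorb the missing constant, so its estimate is inflated well beyond 2.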
Suppose you are asked to create a model that will predict who will drop out of a program your organization offers. You decide to use a binary logistic regression because your outcome has two values: “0” for not dropping out and “1” for dropping out.
Most of us were trained in building models for the purpose of understanding and explaining the relationships between an outcome and a set of predictors. But model building works differently for purely predictive models. Where do we go from here? (more…)
When I was in graduate school, stat professors would say “ANOVA is just a special case of linear regression.” But they never explained why.
And I couldn’t figure it out.
The model notation is different.
The output looks different.
The vocabulary is different.
The focus of what we’re testing is completely different. How can they be the same model?
(more…)
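One way to see the equivalence is to run both analyses on the same data. The sketch below uses made-up scores for three hypothetical groups (A, B, C) and a small hand-written least-squares solver, so every name in it is an assumption for illustration. It fits a regression on dummy-coded group indicators, then computes the one-way ANOVA F statistic from group means.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: three groups of three observations each.
groups = {"A": [4, 5, 6], "B": [7, 8, 9], "C": [1, 2, 3]}

# Regression view: intercept column plus dummies for B and C (A is the reference).
X, y = [], []
for name, values in groups.items():
    for v in values:
        X.append([1.0, float(name == "B"), float(name == "C")])
        y.append(float(v))
b0, b_B, b_C = ols(X, y)

fitted = [b0 * r[0] + b_B * r[1] + b_C * r[2] for r in X]
grand_mean = sum(y) / len(y)
ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_model = sum((fi - grand_mean) ** 2 for fi in fitted)
f_regression = (ss_model / 2) / (ss_resid / (len(y) - 3))

# ANOVA view: between-group and within-group sums of squares.
means = {name: sum(v) / len(v) for name, v in groups.items()}
ss_between = sum(len(v) * (means[name] - grand_mean) ** 2 for name, v in groups.items())
ss_within = sum((obs - means[name]) ** 2 for name, v in groups.items() for obs in v)
f_anova = (ss_between / (len(groups) - 1)) / (ss_within / (len(y) - len(groups)))

print("regression coefficients:", b0, b_B, b_C)
print("group means:", means)
print("F from regression:", f_regression, " F from ANOVA:", f_anova)
```

The intercept reproduces the mean of reference group A, each dummy coefficient is that group's mean minus A's mean, and the two F statistics agree. The regression's model and residual sums of squares are the ANOVA's between-group and within-group sums of squares under different names.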
Have you ever heard that “two tall parents will have shorter children”?
This phenomenon, known as regression to the mean, has been used to explain everything from patterns in hereditary stature (as Galton first did in 1886) to why movie sequels or sophomore albums so often flop.
So just what is regression to the mean (RTM)? (more…)
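A small simulation can give a feel for the idea. This is only a sketch under stated assumptions: heights are expressed as standardized scores, a single parental score stands in for both parents, and the parent-child correlation of 0.5 is an arbitrary illustrative value, not an estimate from Galton's data.

```python
import random

random.seed(42)

r = 0.5    # assumed parent-child correlation, chosen for illustration only
n = 20000  # number of simulated parent-child pairs

pairs = []
for _ in range(n):
    parent = random.gauss(0, 1)
    # The child inherits a fraction r of the parent's deviation from the mean,
    # plus independent noise scaled so the child's variance is also 1.
    child = r * parent + (1 - r ** 2) ** 0.5 * random.gauss(0, 1)
    pairs.append((parent, child))

# Keep only the pairs where the parent is tall (more than 1 SD above the mean).
tall = [(p, c) for p, c in pairs if p > 1.0]
mean_parent = sum(p for p, _ in tall) / len(tall)
mean_child = sum(c for _, c in tall) / len(tall)

print("tall parents, average score:  ", round(mean_parent, 2))
print("their children, average score:", round(mean_child, 2))
```

Under these assumptions, the children of tall parents are still above average as a group, but their average sits closer to the population mean than their parents' does. Nothing in the simulation pulls anyone toward the mean; the pattern comes entirely from the correlation being less than 1.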
When you put a continuous predictor into a linear regression model, you assume its relationship with the dependent variable is linear: the slope stays constant across the predictor’s entire range. But how can you be certain? What is the best way to check this?
And most important, what should you do if it clearly isn’t the case?
Let’s explore a few options for capturing a non-linear relationship between X and Y within a linear regression (yes, really). (more…)
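As one possible illustration (a log transformation of X, which may or may not be among the options the article goes on to cover), the sketch below fits a straight line to made-up data twice: once on X itself and once on ln(X). The data and function name are assumptions for the example. The model stays linear in its coefficients in both cases; only the predictor is re-expressed.

```python
import math


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy ** 2 / (sxx * syy)


# Hypothetical data where Y rises quickly at first and then levels off.
x = [1, 2, 4, 8, 16, 32]
y = [3 + 2 * math.log(v) for v in x]

# Straight line in X: the constant-slope assumption is violated.
_, _, r2_raw = simple_ols(x, y)

# Straight line in ln(X): still a linear regression, but on a transformed predictor.
log_x = [math.log(v) for v in x]
b0_log, b1_log, r2_log = simple_ols(log_x, y)

print("R-squared using X:    ", round(r2_raw, 3))
print("R-squared using ln(X):", round(r2_log, 3))
```

Because the toy data were generated from a logarithmic curve, the transformed fit recovers the generating coefficients and fits much better than the raw one. With real data, a residual plot against X is a common way to judge whether a transformation or an added term is needed.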
In a simple linear regression model, how the constant (a.k.a. the intercept) is interpreted depends on the type of predictor (independent) variable.
If the predictor is categorical and dummy-coded, the constant is the mean value of the outcome variable for the reference category only. If the predictor variable is continuous, the constant equals the predicted value of the outcome variable when the predictor variable equals zero.
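Both interpretations can be checked with a few lines of plain Python. The sketch below uses made-up numbers and a hypothetical helper, `simple_ols`, so treat it as an illustration rather than a recipe.

```python
def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Categorical predictor, dummy-coded: 0 = reference category, 1 = comparison category.
dummy = [0, 0, 0, 1, 1, 1]
outcome = [10, 12, 14, 20, 22, 24]
const_cat, coef_cat = simple_ols(dummy, outcome)
reference_mean = sum(outcome[:3]) / 3

# Continuous predictor: hypothetical data that follow y = 3 + 2x.
x = [1, 2, 3, 4]
y = [5, 7, 9, 11]
const_cont, coef_cont = simple_ols(x, y)
prediction_at_zero = const_cont + coef_cont * 0

print("categorical: constant =", const_cat, " reference-category mean =", reference_mean)
print("continuous:  constant =", const_cont, " prediction at x = 0 =", prediction_at_zero)
```

In the dummy-coded fit the constant equals the reference category's mean, and the coefficient is the difference between the two category means. In the continuous fit the constant is the predicted outcome at x = 0, a value that lies outside the observed range of x in this example.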
Removing the Constant When the Predictor Is Categorical
When your predictor variable X is categorical, the results are logical. Let’s look at an example. (more…)