The following statement might surprise you, but it’s true.
To run a linear model, you don’t need an outcome variable Y that’s normally distributed. Instead, you need a dependent variable that is:
- Continuous
- Unbounded
- Measured on an interval or ratio scale
The normality assumption is about the errors in the model, which have the same distribution as Y|X (shifted to a mean of zero). It’s absolutely possible to have a skewed distribution of Y and a normal distribution of errors, because the skew can come from the effect of X. (more…)
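Here is a minimal simulation sketch of that idea, using only the Python standard library. The data are made up for illustration: a right-skewed predictor, a true slope of 3, and normal errors with standard deviation 1 are all assumptions, and the helper names `skewness` and `fit_simple_ols` are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Assumed setup: X is right-skewed (exponential), errors are normal.
x = [rng.expovariate(1.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

intercept, slope = fit_simple_ols(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

skew_y = skewness(y)
skew_resid = skewness(residuals)

print(f"skewness of Y:         {skew_y:.2f}")
print(f"skewness of residuals: {skew_resid:.2f}")
```

In a setup like this, Y inherits its skew from X, while the residuals should come out roughly symmetric. Looking only at a histogram of Y would suggest a problem that the model does not actually have.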
Predicting future outcomes, the next steps in a process, or the best choice from an array of possibilities is an essential need in many fields. Predictive models are used as decision-making tools in advertising and marketing, meteorology, economics, insurance, health care, and engineering, and would probably be useful in your work too! (more…)
A great tool to have in your statistical tool belt is logistic regression.
It comes in many varieties and many of us are familiar with the variety for binary outcomes.
But multinomial and ordinal varieties of logistic regression are also incredibly useful and worth knowing.
They can be tricky to decide between in practice, however. In some — but not all — situations you (more…)
Multicollinearity can affect any regression model with more than one predictor. It occurs when two or more predictor variables overlap so much in what they measure that their effects are indistinguishable.
When the model tries to estimate their unique effects, it goes wonky (yes, that’s a technical term).
So for example, you may be interested in understanding the separate effects of altitude and temperature on the growth of a certain species of mountain tree.
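To make the overlap concrete, here is a small simulated sketch using only the Python standard library. The numbers are assumptions chosen for illustration: temperature is generated to drop by about 6.5 degrees per 1,000 meters of altitude plus a little noise, and the helper `pearson_r` is a hypothetical name. With exactly two predictors, the variance inflation factor (VIF) for each one is 1 / (1 - r²), where r is the correlation between them.

```python
import random
import statistics


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5


rng = random.Random(7)  # fixed seed so the run is repeatable
n = 500

# Assumed data: altitude in meters, temperature falling with altitude.
altitude = [rng.uniform(500.0, 3000.0) for _ in range(n)]
temperature = [25.0 - 0.0065 * alt + rng.gauss(0.0, 1.0) for alt in altitude]

r = pearson_r(altitude, temperature)

# With two predictors, both share the same VIF: 1 / (1 - r^2).
vif = 1.0 / (1.0 - r ** 2)

print(f"correlation between altitude and temperature: {r:.3f}")
print(f"VIF for each predictor: {vif:.1f}")
```

A large VIF means the variance of each coefficient estimate is inflated by that factor relative to uncorrelated predictors, which is why the separate effects become hard to pin down.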
(more…)
Most of us know that binary logistic regression is appropriate when the outcome variable has two possible outcomes: success and failure.
Two more situations also call for binary logistic regression, although they don’t always look like it.
(more…)