In this video I will answer a question from a recent webinar, Random Intercept and Random Slope Models.
We are answering questions here because more than 500 people attended the webinar live, so we didn’t have time to get through all of them.
If you missed the webinar live, this and the other questions in this series may make more sense if you watch that first. It was part of our free webinar series, The Craft of Statistical Analysis, and you can sign up to get the free recording, handout, and data set at this link:
http://TheCraftofStatisticalAnalysis.com/random-intercept-random-slope-models
Most analysts’ primary focus is checking the distributional assumptions about the residuals: they must be independent and identically distributed (i.i.d.) with a mean of zero and constant variance.
Residuals can also give us insight into the quality of our models.
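To make the idea concrete, here is a minimal sketch of what residuals are in a simple linear regression. The data are made up for illustration, and the fit is done by hand with ordinary least squares rather than with any particular statistics package. Note that software packages do not all use the terms "standardized" and "studentized" the same way; the version below divides each residual by its estimated standard error.

```python
# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for y = b0 + b1 * x.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Raw residual = observed value minus fitted value.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# With an intercept in the model, raw residuals average to zero by construction,
# so a zero mean alone does not confirm the other assumptions.
mean_resid = sum(residuals) / n

# Residual standard error, using n - 2 degrees of freedom.
s = (sum(r ** 2 for r in residuals) / (n - 2)) ** 0.5

# Leverage of each point, then each residual divided by its estimated standard error.
leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
standardized = [r / (s * (1 - h) ** 0.5) for r, h in zip(residuals, leverage)]

for xi, r, z in zip(x, residuals, standardized):
    print(f"x={xi:.0f}  residual={r:+.3f}  standardized={z:+.2f}")
```

Plotting the standardized values against the fitted values is one common way to look for non-constant variance.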
In this webinar, we’ll review and compare what residuals are in linear regression, ANOVA, and generalized linear models. Jeff will cover:
- Which residuals — standardized, studentized, Pearson, deviance, etc. — we use and why
- How to determine if distributional assumptions have been met
- How to use graphs to discover issues like non-linearity, omitted variables, and heteroskedasticity
Knowing how to piece this information together will improve your statistical modeling skills.
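As a rough illustration of how two of those residual types differ, the sketch below computes Pearson and deviance residuals for a Poisson model, using the standard textbook formulas. The observed counts and fitted means are hypothetical, and this is not taken from the webinar materials.

```python
import math

def pearson_residual(y, mu):
    # Raw residual scaled by the Poisson standard deviation sqrt(mu).
    return (y - mu) / math.sqrt(mu)

def deviance_residual(y, mu):
    # Signed square root of this observation's contribution to the deviance.
    # The y * log(y / mu) term is taken as 0 when y == 0.
    term = y * math.log(y / mu) if y > 0 else 0.0
    d = 2.0 * (term - (y - mu))
    return math.copysign(math.sqrt(max(d, 0.0)), y - mu)

# Hypothetical observed counts and fitted means.
observed = [0, 2, 5, 9]
fitted_means = [1.2, 2.0, 3.5, 6.0]

for y, mu in zip(observed, fitted_means):
    p = pearson_residual(y, mu)
    d = deviance_residual(y, mu)
    print(f"y={y}  mu={mu}  pearson={p:+.3f}  deviance={d:+.3f}")
```

The two agree in sign and are both zero when the observed count equals the fitted mean, but they weight large departures differently.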
Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.
(more…)
One of the most common—and one of the trickiest—challenges in data analysis is deciding how to include multiple predictors in a model, especially when they’re related to each other.

Let’s say you are interested in studying work spillover into personal time as a predictor of job burnout.
You have 5 categorical yes/no variables that indicate whether a particular symptom of work spillover is present (see below).
While you could use each individual variable, you’re not really interested in whether any one in particular is related to the outcome. Perhaps it’s not really each symptom that’s important, but the idea that spillover is happening.
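One simple way to capture "spillover is happening" is to count how many symptoms each person reports. The sketch below assumes that approach; the five variable names and the responses are hypothetical stand-ins, and a simple count is only one of several ways to combine related predictors.

```python
# Hypothetical variable names; 1 = yes, 0 = no.
SPILLOVER_ITEMS = [
    "works_evenings",
    "works_weekends",
    "checks_email_at_home",
    "misses_family_events",
    "skips_vacation",
]

# Made-up responses for three respondents.
respondents = [
    {"works_evenings": 1, "works_weekends": 0, "checks_email_at_home": 1,
     "misses_family_events": 0, "skips_vacation": 0},
    {"works_evenings": 1, "works_weekends": 1, "checks_email_at_home": 1,
     "misses_family_events": 1, "skips_vacation": 1},
    {"works_evenings": 0, "works_weekends": 0, "checks_email_at_home": 0,
     "misses_family_events": 0, "skips_vacation": 0},
]

def spillover_index(row):
    # Count of symptoms present, ranging from 0 to 5.
    return sum(row[item] for item in SPILLOVER_ITEMS)

indices = [spillover_index(r) for r in respondents]
print(indices)
```

The resulting index could then be used as a single predictor in place of the five separate yes/no variables.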
(more…)
One question that seems to come up pretty often is:
What is the difference between logistic and probit regression?
Well, let’s start with how they’re the same:
Both are types of generalized linear models. This means they have this form:
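In the usual textbook notation, which is an assumption here and may differ from the notation in the full post, a generalized linear model sets g(E[Y]) equal to a linear predictor such as b0 + b1*x, where g is the link function. The sketch below compares the two inverse links: logistic regression uses the logistic function, and probit regression uses the standard normal CDF.

```python
import math

def inv_logit(eta):
    # Inverse of the logit link: maps a linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    # Inverse of the probit link: the standard normal CDF, written with erf.
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

# Illustrative values of the linear predictor.
for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"eta={eta:+.1f}  logit p={inv_logit(eta):.3f}  probit p={inv_probit(eta):.3f}")
```

Both curves are S-shaped and pass through 0.5 at zero; they differ in how quickly they approach 0 and 1, which is why the coefficients from the two models are on different scales.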

(more…)
We often talk about nested factors in mixed models: students nested in classes, observations nested within subjects.
But in all but the simplest designs, it’s not that straightforward. (more…)
Here’s a common situation.
Your grant application or committee requires sample size estimates. The hard part isn’t the calculations (though they can be hard too); it’s getting the information to plug into them.
Every article you read on the topic says you need to use either pilot data or a similar study as the basis for the values you enter into the software.
You have neither.
No similar studies have ever used the scale you’re using for the dependent variable.
And while you’d love to run a pilot study, it’s just not possible. There are too many practical constraints — time, money, distance, ethics.
What do you do?
(more…)