There are many statistical concepts that are easy to confuse.
Sometimes the problem is the terminology. We have a whole series of articles on Confusing Statistical Terms.
But in these cases, it’s the concepts themselves: similar but distinct concepts that are easy to confuse.
Some of these are quite high-level, and others are fundamental. For each article, I’ve noted the Stage of Statistical Skill at which you’d encounter it.
So in this series of articles, I hope to disentangle some of those similar but distinct concepts in an intuitive way.
Stage 1 Statistical Concepts
The Difference Between:
Stage 2 Statistical Concepts
The Difference Between:
- Interaction and Association
- Crossed and Nested Factors
- Truncated and Censored Data
- Eta Squared and Partial Eta Squared
- Missing at Random and Missing Completely at Random Missing Data
- Model Assumptions, Inference Assumptions, and Data Issues
- Model Building in Explanatory and Predictive Models
Stage 3 Statistical Concepts
The Difference Between:
- Relative Risk and Odds Ratios
- Logistic and Probit Regression
- Link Functions and Data Transformations
- Clustered, Longitudinal, and Repeated Measures Data
- Random Factors and Random Effects
- Repeated Measures ANOVA and Linear Mixed Models
- Principal Component Analysis and Factor Analysis
- Confirmatory and Exploratory Factor Analysis
- Moderation and Mediation
Are there concepts you get mixed up? Please leave them in the comments and I’ll add them to my list.
Jerry says
Thanks for these clarification articles! Another one might be the use of “N” versus “N – 1” in the denominator. It’s been a while since I examined these in any detail, but another stats blogger recently suggested that the use of “N” is for a sample, and “N – 1” is for the population. And of course, the “population” could itself just as easily be a sample unless you measure the entire world! Thanks Karen!
harvey motulsky says
multivariate vs. multiple independent variables
Sriram Ramachandran says
There is also a lot of confusion between odds ratio and hazard ratio (Cox regression survival analysis), and between correlation (scatterplot) and agreement (Bland-Altman plot), which you can add to your list.
Karen Grace-Martin says
Thanks, Sriram. Great ideas!
William Peck says
Just what the Stats doctor ordered! Very good, excited to review.
I mostly figured out correlation (i.e., Pearson’s correlation), both in Excel and SPSS, and a friend corroborated my work in Stata. Correlation is relatively easy to understand, even for the layman.
But Regression always conjures up a blank cloud over my head. It’s not anything a layman can understand imo. Plus it just doesn’t click with me. That said, I have done Logistic Regression in SPSS to good effect, to identify a cohort of college students coming straight from h.s. who are similar to those who go to a college prep school; then we compare how well the two groups did, in terms of GPA, Calculus, Chemistry, Physics, and English.
But Regression in general doesn’t resonate with me … so if you have anything on that, send the link.
Thank you!
Karen Grace-Martin says
Hi William,
We have a lot of resources on regression, though I’m not sure we have much on the fundamental idea of what it is. 🙂
https://www.theanalysisfactor.com/resources/by-topic/linear-regression/
At its most basic, a linear regression is simply the equation of the line that best describes the linear relationship between two variables. It is closely related to correlation, in that it’s the line that best describes the linear relationship the correlation measures. Where it gets really complicated is if you have more than one predictor.
As I write out this “simple explanation” I realize I’m going to need to write something longer. Keep your eye out for a new article. 🙂
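In the meantime, here is a minimal sketch of the one-predictor case described above. It fits the least-squares line by hand and then shows that the same slope can be recovered from the Pearson correlation. The data (`hours`, `score`) and the helper names are made up for illustration, and ordinary least squares is assumed as the meaning of "best describes."

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (N - 1 in the denominator)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the line minimizing squared vertical errors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]

slope, intercept = least_squares_line(hours, score)
r = pearson_r(hours, score)

# With one predictor, the slope is the correlation rescaled by the
# ratio of the two standard deviations.
slope_from_r = r * sample_sd(score) / sample_sd(hours)

print(f"fitted line: score = {intercept:.1f} + {slope:.1f} * hours")
print(f"correlation r = {r:.3f}")
print(f"slope rebuilt from r = {slope_from_r:.1f}")
```

The correlation tells you how tightly the points cluster around a line, while the regression gives you the line itself, in the units of the variables. With more than one predictor this simple rescaling no longer applies, which is where things get more complicated.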