What does it mean for two variables to be correlated?
Is that the same as being associated or related, or is it different?
This is the kind of question that can feel silly, but shouldn’t. It’s just a reflection of the confusing terminology used in statistics. In this case, the technical statistical term looks like the everyday English word, but doesn’t mean exactly the same thing. (more…)
Principal Component Analysis is really, really useful.
One common use is to create a single index variable from a set of correlated variables.
In fact, the very first step in Principal Component Analysis is to create a correlation matrix (a.k.a. a table of bivariate correlations). The rest of the analysis is based on this correlation matrix.
You don’t usually see this step — it happens behind the scenes in your software.
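To make that hidden step visible, here is a minimal sketch in Python with NumPy. It is not what any particular statistics package does internally; the simulated data and the variable names (`shared`, `X`, `index`) are assumptions made up for illustration. It builds the Pearson correlation matrix, decomposes it, and uses the first component's weights to form a single index.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 cases, 3 variables that all share one underlying factor.
shared = rng.normal(size=200)
X = np.column_stack([shared + rng.normal(scale=0.5, size=200) for _ in range(3)])

# Step 1: the Pearson correlation matrix (variables are in columns).
R = np.corrcoef(X, rowvar=False)

# Step 2: eigendecomposition of R. eigh returns eigenvalues in ascending
# order, so reorder them from largest to smallest.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Step 3: apply the first component's weights to the standardized variables
# to get one index score per case.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
index = Z @ eigenvectors[:, 0]

# Share of total variance captured by the first component.
explained = eigenvalues[0] / eigenvalues.sum()
print(np.round(R, 2))
print(round(float(explained), 2))
```

Everything after Step 1 depends only on `R`, which is why the kind of correlation that goes into the matrix matters so much.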
Most PCA procedures calculate that first step using only one type of correlation: Pearson.
And that can be a problem. Pearson correlations assume all variables are normally distributed. That means they have to be truly (more…)