In our last two posts, we explained (1) that in a simple random sample, every member of the population has an equal probability of selection and (2) that there are some really good reasons why complex samples can work better, despite being more complex.
Today, we’re going to talk a bit about one complex sampling technique: stratified sampling.
What is Stratified Sampling?
In stratified sampling, the target population is first classified into subgroups, or strata. (Grammar note: “strata” is the plural of “stratum,” just as “data” is the plural of “datum.”)
A simple random sample is then selected within every stratum.
That’s it.
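The two steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a full survey-sampling routine: the function name `stratified_sample`, its parameters, and the three strata "A", "B", and "C" are all hypothetical, and it assumes the same sample size is wanted in every stratum.

```python
import random
from collections import defaultdict


def stratified_sample(units, key, n_per_stratum, seed=None):
    """Classify units into strata, then draw a simple random sample in each.

    `key` maps a unit to its stratum label. `n_per_stratum` is the number of
    units drawn, without replacement, from every stratum (an assumption made
    here for brevity; real designs often vary it by stratum).
    """
    rng = random.Random(seed)

    # Step 1: classify the target population into strata.
    strata = defaultdict(list)
    for unit in units:
        strata[key(unit)].append(unit)

    # Step 2: simple random sample within every stratum.
    # random.sample raises ValueError if a stratum has fewer than
    # n_per_stratum members.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


# Hypothetical population of 100 people spread over three made-up strata.
population = [{"id": i, "stratum": "ABC"[i % 3]} for i in range(100)]
sample = stratified_sample(population, key=lambda p: p["stratum"],
                           n_per_stratum=5, seed=42)

for name in sorted(sample):
    print(name, [p["id"] for p in sample[name]])
```

Because the strata here are different sizes (34, 33, and 33 people) but each contributes 5 people, the selection probabilities differ slightly between strata, which is exactly what makes this a complex sample.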
For example, let’s say you’re doing a linguistics study within the US. You want to make sure that you have enough (more…)
In our last article, we talked about simple random samples. Simple random samples are, well…simple, but they’re not always optimal or even possible.
Probability samples that don’t meet the assumptions of Simple Random Samples are called Complex Samples.
You’ll also hear the term Complex Survey, which is really just a survey that incorporates some sort of complex sampling design. Because of their size and research goals, surveys are usually* the only type of research study that uses complex samples.
(*But not always. I have seen intervention studies, for example, that used complex sampling.)
What is a Complex Sample?
The defining feature of a complex sample is that sample members do not all have an equal probability of being selected.
That sounds simple enough. But… (more…)
There are two oft-cited assumptions for Analysis of Covariance (ANCOVA), which is used to assess the effect of a categorical independent variable on a numerical dependent variable while controlling for a numerical covariate:
1. The independent variable and the covariate are independent of each other.
2. There is no interaction between the independent variable and the covariate.
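To make the second assumption concrete: "no interaction" means the slope of the dependent variable on the covariate is the same in every group. Here is a minimal Python sketch that fits a separate least-squares slope in each group. The helper `ols_slope`, the group names, and the numbers are all made up for illustration, and comparing slopes by eye is only a description, not a formal test of the interaction term.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x for a single group."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Made-up data: covariate and outcome values for two hypothetical groups.
groups = {
    "control":   {"covariate": [1, 2, 3, 4], "outcome": [2, 4, 6, 8]},
    "treatment": {"covariate": [1, 2, 3, 4], "outcome": [3, 4, 5, 6]},
}

slopes = {name: ols_slope(d["covariate"], d["outcome"])
          for name, d in groups.items()}

for name, slope in slopes.items():
    print(f"{name}: slope = {slope:.2f}")
```

In these made-up data the covariate slope is 2 in one group and 1 in the other, so the lines are not parallel. That is what an interaction between the categorical IV and the covariate looks like.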
In a previous post, I showed a detailed example for an observational study where the first assumption is irrelevant, but I have gotten a number of questions about the second.
So what does it mean, and what should you do, if you find an interaction between the categorical IV and the continuous covariate? (more…)
I sometimes get asked questions that many people need the answer to. Here’s one about non-parametric ANOVA in SPSS.
Question:
Is there a non-parametric 3 way ANOVA out there and does SPSS have a way of doing a non-parametric anova sort of thing with one main independent variable and 2 highly influential cofactors?
Quick Answer:
No.
Detailed Answer:
There is a non-parametric one-way ANOVA, Kruskal-Wallis, and it’s available in SPSS under non-parametric tests. There is even a non-parametric two-way ANOVA, but it doesn’t include interactions (and for the life of me, I can’t remember its name, but I remember learning it in grad school).
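For readers who want to see what Kruskal-Wallis does under the hood, here is a minimal Python sketch of its H statistic. It is not what SPSS runs: the function names are hypothetical, the data are made up, and the sketch leaves out the tie correction and the p-value. The point is only that the test pools all observations, replaces them with ranks, and compares the rank sums across groups.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without the correction for ties."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Made-up scores for three hypothetical groups.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3))  # 7.2
```

Under the null hypothesis, H is compared to a chi-square distribution with (number of groups minus 1) degrees of freedom.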
But there is no non-parametric factorial ANOVA, and it’s because of the nature of interactions and most non-parametrics.
What it basically comes down to is that most non-parametric tests are rank-based. In other words, (more…)
Every so often I point out to a client who exclusively uses menus in SPSS that they can (and should) hit the Paste button instead of OK. Many times, the client never realized it was there.
I am here today to tell you that it is there, and it is wonderful. For a few reasons.
When you use the menus in SPSS, you’re really taking a shortcut. You’re telling SPSS which syntax commands, along with which options, you want to run.
Clicking OK at the end of a dialog box will run the menu options you just picked. You may never see the underlying commands that SPSS just ran.
If instead you hit Paste, those commands won’t automatically be run. Instead, the code to run those commands will be (more…)
One of the things I love about MIXED in SPSS is that the syntax is very similar to GLM. So anyone who is used to the GLM syntax has just a short jump to writing MIXED syntax.
Which is a good thing, because many of the concepts are a big jump.
And because the MIXED dialogue menus are seriously unintuitive, I’ve concluded you’re much better off using syntax.
I was very happy a few years ago when, with version 19, SPSS finally introduced generalized linear mixed models so SPSS users could finally run logistic regression or count models on clustered data.
But then I tried it, and the menus are even less intuitive than in MIXED.
And the syntax isn’t much better. In this case, the syntax structure is quite different from that of MIXED. (more…)