Author: Trent Buskirk, PhD.
In my last article, we got a bit comfortable with the notion of errors in surveys. We discussed sampling errors, which occur because we take a random sample rather than a complete census.
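If you ever have to admit error, sampling error is the type to admit. Polls admit this sort of error frequently by reporting the margin of error. The margin of error is the standard error of an estimate (the usual measure of sampling error) multiplied by a critical value from a reference distribution, such as roughly 1.96 for a 95% confidence interval under a normal approximation. Adding and subtracting the margin of error from the estimate gives the confidence interval.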
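To make the margin of error concrete, here is a minimal Python sketch for a sample proportion. It assumes a simple random sample and a normal approximation; the function name `margin_of_error`, the poll result of 50%, and the sample size of 1,000 are made up for illustration.

```python
import math

def margin_of_error(p_hat, n, critical_value=1.96):
    """Margin of error for a sample proportion under simple random sampling.

    critical_value=1.96 corresponds to a 95% confidence level under a
    normal approximation (an assumption of this sketch).
    """
    # Standard error: the sample-to-sample variability of the proportion.
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return critical_value * standard_error

# Hypothetical poll: 50% support among 1,000 respondents.
p_hat = 0.5
moe = margin_of_error(p_hat, n=1000)
interval = (p_hat - moe, p_hat + moe)
print(f"margin of error: {moe:.3f}")
print(f"95% interval: ({interval[0]:.3f}, {interval[1]:.3f})")
```

With these made-up numbers the margin of error comes out to about 3 percentage points, which is the kind of figure polls usually report.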
But there are some other types of error that can occur in the survey context that, while influential, are a bit more invisible. They are generally referred to as non-sampling error.
These types of errors are associated not with sample-to-sample variability but with sources like selection biases, frame coverage issues, and measurement errors. These are not the kind of errors you want in your survey.
In theory, it is possible to have an estimator that has little sampling error associated with it. That looks good on the surface, but this estimator may yield poor information due to non-sampling errors.
For example, a high rate of non-response may mean that certain kinds of participants are opting out. If they differ from respondents on what the survey measures, the estimates will be biased.
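Here is a small Python sketch of how non-response can bias an estimate even when sampling error is tiny. All of the numbers (two equally sized groups, their response rates, and their average values) are hypothetical, as are the function names.

```python
def population_mean(groups):
    """True mean: each group's mean weighted by its population share."""
    return sum(share * mean for share, _rate, mean in groups)

def respondent_mean(groups):
    """Expected mean among those who respond.

    Each group is weighted by share * response rate, so groups that
    respond more often count for more than they should.
    """
    responding = sum(share * rate for share, rate, _mean in groups)
    total = sum(share * rate * mean for share, rate, mean in groups)
    return total / responding

# Hypothetical groups: (population share, response rate, group mean)
groups = [
    (0.5, 0.8, 10.0),  # responds often
    (0.5, 0.2, 20.0),  # responds rarely
]

true_mean = population_mean(groups)
observed_mean = respondent_mean(groups)
bias = observed_mean - true_mean
print(true_mean, observed_mean, bias)
```

In this made-up case the true mean is 15, but respondents average about 12. A larger sample would shrink the sampling error and leave that gap untouched.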
Likewise, a scale or set of items on the survey could have known measurement error. They may be imprecise in their measurement of the construct of interest or they may measure that construct better for some populations than others. Again, this can bias estimates.
Frame coverage error occurs when the sampling frame does not quite match the target population. This leads to the sample including individuals who aren’t in the target population, missing individuals who are, or both.
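A toy Python sketch of the two kinds of coverage mismatch, using sets. The names in the target population and the sampling frame are invented for illustration.

```python
# Hypothetical target population and sampling frame.
target_population = {"ana", "ben", "cho", "dev", "eli"}
sampling_frame = {"ben", "cho", "dev", "fay", "gus"}

# In the target population but missing from the frame: can never be sampled.
undercoverage = target_population - sampling_frame

# On the frame but not in the target population: could be sampled by mistake.
overcoverage = sampling_frame - target_population

# Share of the target population that the frame actually reaches.
covered = target_population & sampling_frame
coverage_rate = len(covered) / len(target_population)

print(sorted(undercoverage), sorted(overcoverage), coverage_rate)
```

Here the frame misses two people it should include, lists two it should not, and reaches 60% of the target population.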
A perspective called the Total Survey Error Framework allows researchers to evaluate estimates in terms of both sampling and non-sampling errors. It can be very useful in choosing a sampling design that minimizes error as a whole.
So when you think about errors and how they might come about in surveys, don’t forget about the non-sampling variety – those that could come as a result of non-response, measurement, or coverage.
Author: Trent Buskirk, PhD.
As it is in history, literature, criminology and many other areas, context is important in statistics. Knowing where your data come from gives clues about what you can do with that data and what inferences you can make from it.
In survey samples, context is critical because it informs you about how the sample was selected and from what population it was selected.
Author: Trent Buskirk, PhD.
What do you do when you hear the word error? Do you think you made a mistake?
Well, in survey statistics, error could imply that things are as they should be. That might be the best news yet.
Let’s break this down a bit more before you think this might be a typo or, even worse, an error.
In this series, we’ve already talked about what a complex sample isn’t; why you’d ever bother with a complex sample; and stratified sampling.
All this is in support of our upcoming workshop: Introduction to the Analysis of Complex Survey Data Using SPSS. If you want to learn a lot more on this topic, check that out.
In this article, we’re going to discuss another common design feature of complex samples: cluster sampling.
What is Cluster Sampling?
In cluster sampling, you split the population into groups (clusters), randomly choose a sample of clusters, then measure each individual from each selected cluster.
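As a rough Python sketch of that definition: randomly pick whole clusters, then keep every individual in the chosen clusters. The school and student labels are made up, and `cluster_sample` is a hypothetical helper, not a function from any survey package.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """One-stage cluster sample.

    clusters: dict mapping cluster name -> list of individuals.
    Randomly selects n_clusters clusters and returns every individual
    in each selected cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: students grouped by school.
schools = {
    "school_a": ["a1", "a2", "a3"],
    "school_b": ["b1", "b2"],
    "school_c": ["c1", "c2", "c3", "c4"],
    "school_d": ["d1", "d2"],
}

sample = cluster_sample(schools, n_clusters=2, seed=0)
for school, students in sample.items():
    print(school, students)
```

Note that the randomness happens only at the cluster level. Once a school is chosen, all of its students are in the sample.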
The most common and obvious example of cluster sampling is when school children are sampled. An example I…
In our last two posts, we explained (1) that in a simple random sample every member of the population has an equal probability of selection and (2) that there are some really good reasons why complex samples can work better, despite being more complex.
Today, we’re going to talk a bit about one complex sampling technique: stratified sampling.
What is Stratified Sampling?
In stratified sampling, the target population is first classified into subgroups or strata. (Grammar note: “strata” is plural for “stratum” just as “data” is plural for “datum.”)
A simple random sample is then selected within every stratum.
That’s it.
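In code, those two steps might look like the Python sketch below. The strata, their sizes, and the per-stratum sample sizes are invented, and `stratified_sample` is a hypothetical helper rather than part of any survey package.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample within every stratum.

    strata: dict mapping stratum name -> list of population members.
    sizes:  dict mapping stratum name -> number of members to draw.
    """
    rng = random.Random(seed)
    return {
        name: rng.sample(members, sizes[name])  # sampled without replacement
        for name, members in strata.items()
    }

# Hypothetical population of 140 people, already classified into 3 strata.
strata = {
    "stratum_a": list(range(0, 100)),
    "stratum_b": list(range(100, 130)),
    "stratum_c": list(range(130, 140)),
}
sizes = {"stratum_a": 10, "stratum_b": 5, "stratum_c": 5}

sample = stratified_sample(strata, sizes, seed=0)
for name, members in sample.items():
    print(name, sorted(members))
```

Unlike cluster sampling, every stratum is represented here. The random selection happens inside each stratum.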
For example, let’s say you’re doing a linguistics study within the US. You want to make sure that you have enough…
In our last article, we talked about simple random samples. Simple random samples are, well…simple, but they’re not always optimal or even possible.
Probability samples that don’t meet the assumptions of Simple Random Samples are called Complex Samples.
You’ll also hear the term Complex Survey, which is really just a survey that incorporates some sort of complex sampling design. Because of their size and research goals, surveys are usually* the only type of research study that uses complex samples.
(*But not always. I have seen intervention studies, for example, that used complex sampling.)
What is a Complex Sample?
The defining feature of a complex sample is that sample members do not all have an equal probability of being selected.
That sounds simple enough. But…