Sure. One of the big advantages of multiple imputation is that you can use it for any analysis.
It’s one of the reasons big data libraries use it: no matter how researchers are using the data, the missing data is handled the same way, and handled well.
I say this with two caveats. (more…)
Do you find quizzes irresistible? I do.
Here’s a little quiz about working with missing data:
True or False?
1. Imputation is really just making up data to artificially inflate results. It’s better to just drop cases with missing data than to impute.
2. I can just impute the mean for any missing data. It won’t affect results, and improves power.
3. Multiple Imputation is fine for the predictor variables in a statistical model, but not for the response variable.
4. Multiple Imputation is always the best way to deal with missing data.
5. When imputing, it’s important that the imputations be plausible data points.
6. Missing data isn’t really a problem if I’m just doing simple statistics, like chi-squares and t-tests.
7. The worst thing that missing data does is lower sample size and reduce power.
Answers: (more…)
In my last post, I gave a little quiz about missing data. This post has the answers.
If you want to try it yourself before you see the answers, go here. (It’s a short quiz, but if you’re like me, you find testing yourself irresistible.)
True or False?
1. Imputation is really just making up data to artificially inflate results. It’s better to just drop cases with missing data than to impute. (more…)
Missing data, and multiple imputation specifically, is one area of statistics that is changing rapidly. Research is still ongoing, and each year brings new findings on best practices and new techniques in software.
The downside for researchers is that some of the recommendations missing-data statisticians were making even five years ago have changed.
Remember that there are three goals of multiple imputation, or any missing data technique: Unbiased parameter estimates in the final analysis (more…)
Most Multiple Imputation methods assume multivariate normality, so a common question is how to impute missing values in categorical variables.
Paul Allison, one of my favorite authors of statistical information for researchers, did a study showing that the most common method actually gives worse results than listwise deletion. (Did I mention I’ve used it myself?) (more…)
In choosing an approach to missing data, there are a number of things to consider. But you need to keep in mind what you’re aiming for before you can even consider which approach to take.
There are three criteria we’re aiming for with any missing data technique:
1. Unbiased parameter estimates: Whether you’re estimating means, regressions, or odds ratios, you want your parameter estimates to be accurate representations of the actual population parameters. In statistical terms, that means the estimates should be unbiased. If all the (more…)