The great majority of regression modeling explores and tests the association between independent and dependent variables. In most of these models, we cannot claim that the independent variable(s) have a causal relationship with the dependent variable. There are five specific model types that allow us to test for causality. Difference in differences models are one of the five.
(more…)
Have you ever wondered why there are so many different types of experimental designs, and how a researcher would go about choosing among them to best address their research questions? (more…)
Post-hoc tests, whether pairwise comparisons or other linear contrasts, are typical in an analysis of variance (ANOVA) setting to understand which group means differ. They incorporate p-value adjustments to limit the risk of concluding that group means differ when they actually do not. Several adjustments can be considered when conducting multiple post-hoc tests, including single-step and stepwise adjustments. (more…)
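To make the single-step versus stepwise distinction concrete, here is a minimal sketch in plain Python. It assumes Bonferroni as the single-step example and Holm as the stepwise example; the post itself may discuss other adjustments. The function names and the p-values are hypothetical and chosen only for illustration.

```python
def bonferroni(pvalues):
    """Single-step adjustment: every p-value is multiplied by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Stepwise (step-down) adjustment: the multiplier shrinks as p-values grow."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: an adjusted p-value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
bonferroni_adjusted = bonferroni(raw)
holm_adjusted = holm(raw)
```

With these made-up inputs, the Holm-adjusted values are never larger than the Bonferroni-adjusted ones, which is why stepwise procedures are generally the less conservative choice.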
If you’ve been doing data analysis for very long, you’ve certainly come across terms, concepts, and processes of matrix algebra. Not just matrices, but:
- Matrix addition and multiplication
- Traces and determinants
- Eigenvalues and eigenvectors
- Inverting and transposing
- Positive and negative definite matrices
(more…)

For nearly a hundred years the concept of “statistical significance” has been fundamental to statistics and to science. And for nearly that long, it has been controversial and misused as well. (more…)

Missing data is a common problem in data analysis. One successful approach is k-Nearest Neighbor (kNN) imputation, a simple method that leverages known information to impute unknown values with a relatively high degree of accuracy. (more…)
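As a rough illustration of the idea, the sketch below fills each missing value with the average of that column over the k most similar rows. It is an assumption-laden toy, not the method from the post: similarity is taken to be root-mean-square distance over the columns both rows have observed, missing values are represented as `None`, and the function name `knn_impute` and the data are hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_i, other in enumerate(rows):
                # A neighbor must be a different row with column j observed.
                if other_i == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                # Root-mean-square distance over the columns both rows observe.
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Hypothetical data: the last row is missing its third value.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
    [1.0, 2.0, None],
]
completed = knn_impute(data, k=2)
```

Here the two rows closest to the incomplete one supply the imputed value, while the distant row `[9.0, 9.0, 9.0]` is ignored. In practice the variables would usually be standardized first so that no single column dominates the distance.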