Missing data is a common problem in data analysis. One successful approach to handling it is k-Nearest Neighbor (kNN) imputation, a simple method that uses the observed values of similar cases to fill in unknown values with a relatively high degree of accuracy. (more…)
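To make the idea concrete, here is a minimal sketch of kNN imputation in plain Python. It is an illustration under simple assumptions, not the method from the full post: the function name `knn_impute`, the toy data, and the choice of a root-mean-square distance over shared columns are all hypothetical. Each missing value is replaced by the average of that column among the k most similar rows.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows where each None is replaced by the mean of
    that column among the k nearest rows that have the value observed."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                # Compare rows only on columns observed in both of them.
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:  # leave the gap if no usable neighbor exists
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy data (made up for illustration): the last row is missing column 3.
data = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [5.0, 6.0, 50.0],
    [1.05, 2.05, None],
]
filled = knn_impute(data, k=2)
print(filled[3])
```

In practice the columns would usually be standardized first so that no single variable dominates the distance, and the value of k would be tuned rather than fixed.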
Even if you’ve never heard the term Generalized Linear Model, you may have run one. It’s a term for a family of models that includes logistic and Poisson regression, among others.
It’s a small leap to generalized linear models, if you already understand linear models. Many, many concepts are the same in both types of models.
But one thing that’s perplexing to many is why generalized linear models have no error term, like linear models do. (more…)
A Gentle Introduction to Random Slopes in Multilevel Modeling
…aka, how to look at cool interaction effects for nested data.
Do the words “random slopes model” or “random coefficients model” send shivers down your spine? These words don’t have to be so ominous. Journal editors are increasingly asking researchers to analyze their data using this particular approach, and for good reason.
Creating a quality scale for a latent construct (a construct that cannot be directly measured with a single observed variable) takes many steps. Structural Equation Modeling is set up well for this task.
One important step in creating scales is making sure the scale measures the latent construct equally well and the same way for different groups of individuals.
Most of the time when we plan a sample size for a data set, it’s based on obtaining reasonable statistical power for a key analysis of that data set. These power calculations figure out how big a sample you need so that a scientifically meaningful effect size will produce a sufficiently narrow confidence interval or a sufficiently small p-value.
But that’s not the only issue in sample size, and not every statistical analysis uses p-values.
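As a rough illustration of the power-based planning described above, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 × ((z for 1 − α/2 + z for power) / d)², where d is the standardized effect size. The function name `n_per_group` and the default values for alpha and power are assumptions chosen for the example; an exact t-test calculation would typically give a slightly larger number.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means, using the normal approximation."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# Smaller meaningful effects require larger samples.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

The useful pattern here is the inverse-square relationship: halving the effect size you care about roughly quadruples the required sample.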
Interpreting the results of logistic regression can be tricky, even for people who are familiar with performing different kinds of statistical analyses. How do we then share these results with non-researchers in a way that makes sense?
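One common way to make logistic regression results easier to communicate is to translate coefficients from the log-odds scale into odds ratios and predicted probabilities. The sketch below shows that conversion. The intercept and slope are invented numbers for illustration only, and the helper names are hypothetical.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(coefficient)


def predicted_probability(intercept, coefficient, x):
    """Probability of the outcome at predictor value x (inverse logit)."""
    log_odds = intercept + coefficient * x
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical fitted values, not taken from any real analysis.
intercept, slope = -1.0, 0.7

print(f"Odds ratio per unit of x: {odds_ratio(slope):.2f}")
for x in (0, 1, 2):
    p = predicted_probability(intercept, slope, x)
    print(f"x = {x}: predicted probability = {p:.1%}")
```

Predicted probabilities at a few meaningful predictor values are often easier for non-researchers to follow than odds ratios, because they answer the question "how likely is this outcome?" directly.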