SPSS has a nice little feature for adding and averaging variables with missing data that many people don’t know about.
It allows you to add or average variables that have some missing data, while specifying how many are allowed to be missing.
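The idea can be sketched outside SPSS as well. The Python snippet below is only an illustration of the logic, not SPSS syntax (in SPSS itself this is typically written with a numeric suffix on the function name, as in MEAN.2). The function name `mean_if_enough`, the parameter `min_valid`, and the item scores are all made up for the example.

```python
def mean_if_enough(values, min_valid):
    """Mean of the non-missing values, or None if fewer than min_valid are present."""
    present = [v for v in values if v is not None]
    if len(present) < min_valid:
        return None
    return sum(present) / len(present)


# Hypothetical respondents with three item scores each; None marks a missing value.
rows = [
    [4, 5, 3],        # nothing missing
    [2, None, 4],     # one missing: still enough data
    [None, None, 5],  # two missing: too little data
]

# Require at least 2 valid items before computing a scale score.
scale_scores = [mean_if_enough(row, min_valid=2) for row in rows]
print(scale_scores)  # [4.0, 3.0, None]
```

The third respondent gets a missing scale score because only one of the three items was answered.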
You might be surprised to hear that not only can linear regression fit lines between a response variable Y and one or more predictor variables, X, it can fit curves too. There are many ways to do this, but the simplest is by adding a polynomial term.
So what is a polynomial term and how do you know you need one?
The linear parameters in a regression model
A linear regression model has a few key parameters. These include the intercept coefficient, the slope coefficient, and the residual variance.
The intercept defines the height of the regression line. It does so by measuring the height of the line at one specific point: where every X equals 0.
The slope defines how much Y differs, on average, for each one unit difference in X. In other words, it measures the constant relationship between X and Y. Yes, there can be multiple Xs and each one has its own slope.
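To make the two coefficients concrete, here is a minimal Python sketch with one predictor. The data are invented and lie exactly on the line Y = 2 + 3X so the arithmetic is easy to follow. `fit_line` is a hypothetical helper written for this example, using the usual least-squares formulas for a single X.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data that follow Y = 2 + 3X exactly.
x = [0, 1, 2, 3, 4]
y = [2, 5, 8, 11, 14]

intercept, slope = fit_line(x, y)


def predict(xi):
    return intercept + slope * xi


print(intercept)                # 2.0, the height of the line at X = 0
print(slope)                    # 3.0
print(predict(3) - predict(2))  # 3.0, the difference in Y for a one-unit difference in X
```

The difference between predictions one unit apart is the same wherever you pick the two X values, which is what "constant relationship" means here.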
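A polynomial term, such as a quadratic (squared) or cubic (cubed) term, turns the fitted regression line into a curve. The model is still linear in its parameters; only the shape of the fitted relationship between X and Y changes.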
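As a rough illustration of adding a squared term, the sketch below fits Y on X and then on X plus X squared, using ordinary least squares via the normal equations. The data are invented to follow Y = 1 + 2X - 0.5X² exactly, and `solve` and `fit_least_squares` are hypothetical helpers written for this example. In practice you would use your statistical software's regression procedure instead.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(columns, y):
    """Least-squares coefficients, one per predictor column (normal equations)."""
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)


# Invented data that follow Y = 1 + 2X - 0.5X^2 exactly.
x = [0, 1, 2, 3, 4, 5]
y = [1.0, 2.5, 3.0, 2.5, 1.0, -1.5]

ones = [1.0] * len(x)          # column for the intercept
x_sq = [xi ** 2 for xi in x]   # the polynomial (quadratic) term

linear = fit_least_squares([ones, x], y)           # intercept, slope
quadratic = fit_least_squares([ones, x, x_sq], y)  # intercept, slope, squared term

print(linear)     # a straight line that misses the rise and fall in Y
print(quadratic)  # coefficients close to 1, 2, and -0.5
```

The quadratic model is fit with exactly the same machinery as the straight line. The only change is one extra column of predictor values.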
No matter what statistical model you’re running, you need to go through the same steps. The order and the specifics of how you do each step will differ depending on the data and the type of model you use.
These steps fall into four phases. Most people think of only the third as modeling. But the phases before it are fundamental to making the modeling go well. The modeling will be much easier, more accurate, and more efficient if you don’t skip them.
And there is no point in running the model if you skip phase 4.
If you think of them all as part of the analysis, the modeling process will be faster, easier, and make more sense.
Phase 1: Define and Design
In the first 5 steps of running the model, the objective is clarity. You want to make everything as clear as possible to yourself. The clearer things are at this point, the smoother everything will be.
A well-fitting regression model results in predicted values close to the observed data values. The mean model, which uses the mean of Y as every predicted value, is what you would generally use if there were no useful predictor variables. The fit of a proposed regression model should therefore be better than the fit of the mean model. But how do you measure that model fit?
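One common way to make that comparison is to set the squared prediction errors of the proposed model against those of the mean model, which is the idea behind R-squared. The Python sketch below uses invented observed values and invented predictions from a hypothetical regression model; it is meant only to show the arithmetic.

```python
def sum_squared_error(observed, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Invented observed values and predictions from a hypothetical regression model.
observed = [2.0, 4.0, 5.0, 4.0, 5.0]
model_predictions = [2.8, 3.4, 4.0, 4.6, 5.2]

# The mean model predicts the mean of Y for every case.
mean_y = sum(observed) / len(observed)
mean_predictions = [mean_y] * len(observed)

sse_model = sum_squared_error(observed, model_predictions)  # about 2.4
sse_mean = sum_squared_error(observed, mean_predictions)    # 6.0

# Proportion of the mean model's error that the regression model removes.
r_squared = 1 - sse_model / sse_mean
print(round(r_squared, 3))  # 0.6
```

Here the regression model's squared error is 60% smaller than the mean model's, so it fits better than simply predicting the mean for everyone.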
One of those tricky, but necessary, concepts in statistics is the difference between crossed and nested factors.
As a reminder, a factor is any categorical independent variable. In experiments, or any randomized designs, these factors are often manipulated. Experimental manipulations (like Treatment vs. Control) are factors.
Observational categorical predictors, such as gender, time point, poverty status, etc., are also factors. Whether the factor is observational or manipulated won’t affect the analysis, but it will affect the conclusions you draw from the results.
There are not a lot of statistical methods designed just to analyze ordinal variables.
But that doesn’t mean that you’re stuck with few options. There are more than you’d think.
Some are better than others, but it depends on the situation and research questions.
Here are five options when your dependent variable is ordinal.