Last time we created two variables and added a best-fit regression line to our plot of the variables. Here are the two variables again. (more…)
Today let’s re-create two variables and see how to plot them and include a regression line. We take height to be a variable that describes the heights (in cm) of ten people. (more…)
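The code itself sits behind the “(more…)” link, so here is only a rough sketch of the idea, written in Python (the original series may well use a different tool). The height values, and the second variable called weight, are invented purely for illustration:

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical data: heights (in cm) of ten people, plus a second
    # made-up variable to plot against (the excerpt only names height).
    height = np.array([176, 154, 138, 196, 132, 176, 181, 169, 150, 175])
    weight = np.array([82, 49, 53, 112, 47, 69, 77, 71, 62, 78])

    # Scatter plot of the two variables
    plt.scatter(height, weight)
    plt.xlabel("Height (cm)")
    plt.ylabel("Weight (kg)")

    # Best-fit (least-squares) regression line over the observed range
    slope, intercept = np.polyfit(height, weight, 1)
    xs = np.linspace(height.min(), height.max(), 100)
    plt.plot(xs, intercept + slope * xs)

    plt.show()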
I get this question a lot, and it’s difficult to answer at first glance; it depends too much on your particular situation.
There are really three parts to the approach to building a model: the strategy, the technique to implement that strategy, and the decision criteria used within the technique. (more…)
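To make the three parts concrete, here is one entirely hypothetical combination, sketched in Python. It is not the approach recommended in the post, just an example of how strategy, technique, and decision criterion fit together: forward selection as the strategy, adding one candidate predictor at a time as the technique, and AIC as the decision criterion.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical data: a response y and three candidate predictors.
    rng = np.random.default_rng(0)
    n = 100
    X = pd.DataFrame(rng.normal(size=(n, 3)), columns=["x1", "x2", "x3"])
    y = 2 + 1.5 * X["x1"] - 0.8 * X["x2"] + rng.normal(size=n)

    def fit_aic(cols):
        """Fit OLS with the given predictors (plus intercept) and return AIC."""
        design = sm.add_constant(X[cols]) if cols else np.ones((n, 1))
        return sm.OLS(y, design).fit().aic

    # Strategy: start simple and build up (forward selection).
    # Technique: add the single best remaining predictor at each step.
    # Decision criterion: keep going only while AIC improves.
    selected, remaining = [], list(X.columns)
    current_aic = fit_aic(selected)
    while remaining:
        aics = {c: fit_aic(selected + [c]) for c in remaining}
        best = min(aics, key=aics.get)
        if aics[best] >= current_aic:
            break  # no candidate improves AIC; stop
        selected.append(best)
        remaining.remove(best)
        current_aic = aics[best]

    print("Selected predictors:", selected)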
A normal probability plot is extremely useful for testing normality assumptions. It’s more precise than a histogram, which can’t pick up subtle deviations, and it doesn’t suffer from too much or too little power the way formal tests of normality do.
There are two versions of normal probability plots: Q-Q (quantile-quantile) and P-P (probability-probability). I’ll start with the Q-Q. (more…)
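As a quick sketch of what the Q-Q version looks like in practice, here is one way to draw it in Python (scipy’s probplot handles the quantile-versus-quantile plotting); the sample below is made up:

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    # Hypothetical sample; in practice this would be your data or residuals.
    rng = np.random.default_rng(1)
    sample = rng.normal(loc=50, scale=10, size=200)

    # Q-Q plot: ordered sample values against theoretical normal quantiles.
    # Points hugging the reference line suggest the data are roughly normal.
    stats.probplot(sample, dist="norm", plot=plt)
    plt.title("Normal Q-Q plot")
    plt.show()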
In a statistical model, any statistical model, there is generally one way that a predictor X and a response Y can relate:

X → Y
This relationship can take on different forms, of course, like a line or a curve, but there’s really only one relationship here to measure.
Usually the point is to model the predictive ability, the effect, of X on Y.
In other words, there is a clear response variable, although not necessarily a causal relationship. We could have switched the direction of the arrow to indicate that Y predicts X or used a two-headed arrow to show a correlation, with no direction, but that’s a whole other story.
For our purposes, Y is the response variable and X the predictor.
But a third variable–another predictor–can relate to X and Y in a number of different ways. How this predictor relates to X and Y changes how we interpret the relationship between X and Y. (more…)
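One hypothetical illustration (not from the post) of how much that third variable can matter: in the Python simulation below, z drives both x and y, so the estimated effect of x on y changes noticeably depending on whether z is in the model.

    import numpy as np
    import statsmodels.api as sm

    # Simulate a third variable z that affects both the predictor x and
    # the response y (one of several ways a third variable can relate to them).
    rng = np.random.default_rng(2)
    n = 500
    z = rng.normal(size=n)
    x = 0.8 * z + rng.normal(size=n)
    y = 0.5 * x + 1.0 * z + rng.normal(size=n)

    # Model with x only: the x coefficient absorbs part of z's effect.
    m1 = sm.OLS(y, sm.add_constant(x)).fit()

    # Model with x and z: the x coefficient is close to the true 0.5.
    m2 = sm.OLS(y, sm.add_constant(np.column_stack([x, z]))).fit()

    print("x effect, z omitted: ", round(m1.params[1], 2))
    print("x effect, z included:", round(m2.params[1], 2))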
There are two oft-cited assumptions for Analysis of Covariance (ANCOVA), which is used to assess the effect of a categorical independent variable on a numerical dependent variable while controlling for a numerical covariate:
1. The independent variable and the covariate are independent of each other.
2. There is no interaction between the independent variable and the covariate.
In a previous post, I showed a detailed example for an observational study where the first assumption is irrelevant, but I have gotten a number of questions about the second.
So what does it mean, and what should you do, if you find an interaction between the categorical IV and the continuous covariate? (more…)
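A minimal sketch of how one might check that second assumption, using Python and statsmodels purely for illustration, with made-up data in which the covariate slope genuinely differs by group:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Hypothetical data: a two-group IV, a continuous covariate, and an
    # outcome whose covariate slope differs by group (an IV-by-covariate
    # interaction, i.e., a violation of the usual ANCOVA assumption).
    rng = np.random.default_rng(3)
    n = 120
    group = rng.choice(["control", "treatment"], size=n)
    covariate = rng.normal(50, 10, size=n)
    slope = np.where(group == "treatment", 0.9, 0.4)
    outcome = 10 + 2 * (group == "treatment") + slope * covariate + rng.normal(size=n)
    df = pd.DataFrame({"group": group, "covariate": covariate, "outcome": outcome})

    # Standard ANCOVA assumes a common slope; adding group:covariate
    # (via the * in the formula) lets us test whether the slopes differ.
    ancova = smf.ols("outcome ~ group + covariate", data=df).fit()
    with_interaction = smf.ols("outcome ~ group * covariate", data=df).fit()

    # Compare the common-slope model with the interaction model:
    # a significant F-test here is evidence against assumption 2.
    print(sm.stats.anova_lm(ancova, with_interaction))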