One question that seems to come up pretty often is:
Well, let’s start with how they’re the same:
Both are types of generalized linear models. This means they have this form:
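In its standard form (sketched here with link function g, expected outcome E[Y], and predictors X1 through Xk; the exact notation in the full post may differ), that is:

$$ g\big(E[Y]\big) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k $$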
An incredibly useful tool in evaluating and comparing predictive models is the ROC curve.
Its name is indeed strange. ROC stands for Receiver Operating Characteristic, a term that comes from sonar work back in the 1940s. ROCs were used to measure how well a sonar signal (e.g., from an enemy submarine) could be distinguished from noise (e.g., a school of fish).
ROC curves are a nice way to see how well any predictive model can distinguish between true positives and true negatives. (more…)
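As a hedged illustration (the simulated data and variable names below are assumptions, not taken from the original post), a minimal base-R sketch builds an ROC curve by hand like this:

```r
# Build an ROC curve by hand from a simulated score and outcome.
set.seed(1)
n     <- 200
truth <- rbinom(n, 1, 0.4)        # true 0/1 outcomes
score <- truth + rnorm(n)         # a noisy predictive score (stand-in for model output)

thresholds <- sort(unique(score), decreasing = TRUE)
tpr <- sapply(thresholds, function(t) mean(score[truth == 1] >= t))  # sensitivity
fpr <- sapply(thresholds, function(t) mean(score[truth == 0] >= t))  # 1 - specificity

plot(fpr, tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate",
     main = "ROC curve")
abline(0, 1, lty = 2)             # the no-skill diagonal for reference
```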
Relative Risk and Odds Ratios are often confused, despite being distinct concepts. Why?
Well, both measure association between a binary outcome variable and a continuous or binary predictor variable. (more…)
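As a quick hedged sketch (the counts below are made up for illustration), here is how the two are computed from the same 2 x 2 table in R:

```r
# Relative risk vs. odds ratio from one 2 x 2 table (illustrative counts).
exposed_events   <- 30; exposed_total   <- 100
unexposed_events <- 15; unexposed_total <- 100

risk_exposed   <- exposed_events / exposed_total          # 0.30
risk_unexposed <- unexposed_events / unexposed_total      # 0.15
relative_risk  <- risk_exposed / risk_unexposed           # ratio of probabilities: 2.0

odds_exposed   <- risk_exposed / (1 - risk_exposed)
odds_unexposed <- risk_unexposed / (1 - risk_unexposed)
odds_ratio     <- odds_exposed / odds_unexposed           # ratio of odds: about 2.43

c(relative_risk = relative_risk, odds_ratio = odds_ratio)
```

The same table gives two different numbers, which is exactly why the two statistics get mixed up.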
Effect size statistics are expected by many journal editors these days.
If you’re running an ANOVA, t-test, or linear regression model, it’s pretty straightforward which ones to report.
Things get trickier, though, once you venture into other types of models.
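To make the straightforward case concrete, here is a hedged sketch (simulated data, illustrative names) of one effect size commonly reported with a two-sample t-test, Cohen's d:

```r
# Cohen's d alongside a two-sample t-test (simulated data).
set.seed(2)
group_a <- rnorm(50, mean = 10, sd = 2)
group_b <- rnorm(50, mean = 11, sd = 2)

pooled_sd <- sqrt(((length(group_a) - 1) * var(group_a) +
                   (length(group_b) - 1) * var(group_b)) /
                  (length(group_a) + length(group_b) - 2))
cohens_d <- (mean(group_b) - mean(group_a)) / pooled_sd   # standardized mean difference

t.test(group_a, group_b)   # the test itself
cohens_d                   # the effect size to report alongside it
```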
Ordinary Least Squares regression provides linear models of continuous variables. However, many kinds of data of interest to statisticians and researchers are not continuous, so other methods must be used to create useful predictive models.
The glm() command is designed to fit generalized linear models (regressions) to binary outcome data, count data, probability data, proportion data, and many other data types.
In this blog post, we explore the use of R’s glm() command on one such data type. Let’s take a look at a simple example where we model binary data.
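Here is a minimal, hedged sketch of what that looks like (the data are simulated here rather than taken from the post's own example):

```r
# Logistic regression on simulated binary data with glm().
set.seed(3)
n    <- 150
x    <- rnorm(n)
prob <- plogis(-0.5 + 1.2 * x)        # true probabilities on the logistic scale
y    <- rbinom(n, 1, prob)            # observed 0/1 outcome

fit <- glm(y ~ x, family = binomial)  # binomial family uses the logit link by default
summary(fit)

# Predicted probabilities at a few new values of x
predict(fit, newdata = data.frame(x = c(-1, 0, 1)), type = "response")
```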
One great thing about logistic regression, at least for those of us who are trying to learn how to use it, is that the predictor variables work exactly the same way as they do in linear regression.
Dummy coding, interactions, quadratic terms: they all work the same way.
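A short hedged sketch (simulated data; the variable names are illustrative) of those same pieces inside a logistic model:

```r
# Dummy coding, an interaction, and a quadratic term in glm(), just as in lm().
set.seed(4)
n     <- 200
x1    <- rnorm(n)
group <- factor(sample(c("control", "treatment"), n, replace = TRUE))
y     <- rbinom(n, 1, plogis(0.3 * x1 - 0.2 * x1^2 + 0.8 * (group == "treatment")))

# 'group' is dummy coded automatically, x1:group is the interaction,
# and I(x1^2) adds the quadratic term.
fit <- glm(y ~ x1 + I(x1^2) + group + x1:group, family = binomial)
summary(fit)
```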
In pretty much every regression procedure in every stat software package, the default way to code categorical variables is with dummy coding.
All dummy coding means is recoding the original categorical variable into a set of binary variables that have values of one and zero. You may find it helpful to (more…)
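To see what that recoding looks like in practice, here is a small hedged sketch (the factor levels are made up) using R's model.matrix(), which displays the 0/1 columns created from a categorical variable:

```r
# What dummy coding produces behind the scenes.
condition <- factor(c("a", "b", "c", "a", "c"))
model.matrix(~ condition)
# The first level ("a") is the reference; "b" and "c" each get a 0/1 indicator column.
```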