In Part 10, let’s look at the aggregate command for creating summary tables using R.
You may have a complex data set that includes categorical variables with several levels, and you may wish to create summary tables for each level of a categorical variable.
For example, your data set may include the variable Gender, a two-level categorical variable with levels Male and Female. Your data set may include other categorical variables such as Ethnicity, Hair Colour, the Treatments received by patients in a medical study, or the number of cylinders in motor vehicles.
In any case, you may wish to produce summary statistics for each level of the categorical variable. This is where the aggregate command is so helpful. (more…)
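As a minimal sketch of the idea, here is aggregate() applied to the mtcars data set that ships with base R. This data set is my choice for illustration and may not be the one used in the full post; its cyl column holds the number of cylinders, which matches one of the categorical variables mentioned above.

```r
# mtcars ships with base R; cyl is the number of cylinders (4, 6 or 8).
# Assumption: this data set stands in for whatever the full post uses.
data(mtcars)

# Mean miles per gallon for each level of cyl (formula interface).
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(mpg_by_cyl)

# Several variables at once, grouped by two categorical variables
# (cyl and am, the transmission type).
by_cyl_am <- aggregate(cbind(mpg, hp) ~ cyl + am, data = mtcars, FUN = mean)
print(by_cyl_am)
```

The formula reads as "summarise the left-hand side for each level of the right-hand side", and FUN can be any summary function such as mean, median or sd.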
In Part 9, let’s look at sub-setting in R. I want to show you two approaches.
Let’s provide summary tables on the following data set of tourists from different nations, their gender and numbers of children. Copy and paste the following array into R. (more…)
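The data set and the two approaches are not shown in this excerpt, so the sketch below makes its own assumptions: the tourists data frame is hypothetical, and the two approaches shown (square-bracket indexing and the subset() function) are two common ways to sub-set in R that may or may not be the ones the post has in mind.

```r
# Hypothetical data: NOT the array from the post, which is not shown here.
tourists <- data.frame(
  nation   = c("China", "China", "Japan", "Japan", "USA", "USA"),
  gender   = c("F", "M", "F", "M", "F", "M"),
  children = c(1, 0, 2, 3, 0, 2)
)

# Approach 1: logical indexing with square brackets.
# The condition selects rows; the empty slot after the comma keeps all columns.
females_brackets <- tourists[tourists$gender == "F", ]

# Approach 2: the subset() function, which evaluates the condition
# inside the data frame, so the tourists$ prefix is not needed.
females_subset <- subset(tourists, gender == "F")

# A small summary table: total children per nation among the selected rows.
children_by_nation <- tapply(females_subset$children, females_subset$nation, sum)
print(children_by_nation)
```

Both approaches return the same rows here. subset() is often easier to read at the console, while bracket indexing is the more general tool.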
Let’s look at some basic commands in R.
Set up the following vectors by copying and pasting from this document:
a <- c(3,-7,-3,-9,3,-1,2,-12, -14)
b <- c(3,7,-5, 1, 5,-6,-9,16, -8)
d <- c(1,2,3,4,5,6,7,8,9)
Now figure out what each of the following commands does. You should not need me to explain each command, but I will explain a few. (more…)
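The commands themselves are cut off in this excerpt. As a sketch of the kind of thing you can try with the three vectors above, here are a few basic vector operations; the selection is my own and is not taken from the post.

```r
a <- c(3,-7,-3,-9,3,-1,2,-12, -14)
b <- c(3,7,-5, 1, 5,-6,-9,16, -8)
d <- c(1,2,3,4,5,6,7,8,9)

# Arithmetic on vectors works element by element.
a_plus_b <- a + b
print(a_plus_b)

# Summary functions collapse a vector to a single value.
total_a <- sum(a)
mean_d  <- mean(d)

# which() returns the positions where a condition is TRUE.
positive_positions <- which(a > 0)

# A logical condition on one vector can select elements of another.
a_where_b_positive <- a[b > 0]
print(a_where_b_positive)
```

Typing the name of any of these objects at the console prints its value, which is the quickest way to check what a command did.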
In Part 7, let’s look at further plotting in R. Try entering the following three commands together (the semi-colon allows you to place several commands on the same line).
Let’s take an example with two variables and enhance it.
X <- c(3, 4, 6, 6, 7, 8, 9, 12)
B1 <- c(4, 5, 6, 7, 17, 18, 19, 22)
B2 <- c(3, 5, 8, 10, 19, 21, 22, 26)
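One possible way to plot both variables against X and enhance the result is sketched below. The colours, plotting symbols, labels and legend position are my own choices for illustration, not necessarily those used in the full post.

```r
X  <- c(3, 4, 6, 6, 7, 8, 9, 12)
B1 <- c(4, 5, 6, 7, 17, 18, 19, 22)
B2 <- c(3, 5, 8, 10, 19, 21, 22, 26)

# Set the y-axis range so that both series fit on the same plot.
y_range <- range(B1, B2)

# Plot the first series as points joined by lines.
plot(X, B1, type = "o", pch = 17, col = "blue", ylim = y_range,
     xlab = "X", ylab = "B", main = "B1 and B2 against X")

# Add the second series to the existing plot.
lines(X, B2, type = "o", pch = 16, lty = 2, col = "red")

# Identify the two series in a legend.
legend("topleft", legend = c("B1", "B2"), pch = c(17, 16),
       lty = c(1, 2), col = c("blue", "red"))
```

Setting ylim before adding the second series matters: plot() fixes the axes from its own arguments, and lines() will not rescale them.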
(more…)
In Part 6, let’s look at basic plotting in R. Try entering the following three commands together (the semi-colon allows you to place several commands on the same line).
x <- seq(-4, 4, 0.2) ; y <- 2*x^2 + 4*x - 7
plot(x, y)
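As a small sketch of where this can go next, the same curve can be drawn as a line with a reference line and its lowest point marked. The styling arguments below are my own choices, not taken from the post.

```r
x <- seq(-4, 4, 0.2) ; y <- 2*x^2 + 4*x - 7

# Draw the curve as a line rather than as separate points.
plot(x, y, type = "l", col = "blue", main = "y = 2x^2 + 4x - 7")

# Add a dashed horizontal reference line at y = 0.
abline(h = 0, lty = 2)

# Mark the lowest point of the curve on the plotted grid.
lowest <- which.min(y)
points(x[lowest], y[lowest], pch = 19, col = "red")
```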
(more…)
In Part 3 and Part 4 we used the lm() command to perform least squares regressions. We saw how to check for non-linearity in our data by fitting polynomial models and checking whether they fit the data better than a linear model. Now let’s see how to fit an exponential model in R.
As before, we will use a data set of counts (atomic disintegration events that take place within a radiation source), taken with a Geiger counter at a nuclear plant.
The counts were registered over a 30-second period for a short-lived, man-made radioactive compound. We read in the data and subtract the background count of 623.4 counts per second in order to obtain
(more…)
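The Geiger counter data are not included in this excerpt, so the sketch below uses simulated, hypothetical counts. It shows one common way to fit an exponential model with lm(): regress the logarithm of the counts on time. Whether the full post takes exactly this route is an assumption on my part.

```r
set.seed(1)

# Hypothetical data: simulated exponential decay with multiplicative noise.
# These are NOT the counts from the post.
time   <- 1:30
counts <- 1000 * exp(-0.15 * time) * exp(rnorm(30, sd = 0.05))

# A straight line fitted to log(counts) corresponds to the model
# counts = exp(intercept) * exp(slope * time).
exp_fit <- lm(log(counts) ~ time)
print(summary(exp_fit))

# Back-transform the coefficients to the original scale.
decay_rate <- -coef(exp_fit)[["time"]]
initial    <- exp(coef(exp_fit)[["(Intercept)"]])

# Plot the data with the fitted exponential curve on top.
plot(time, counts)
lines(time, exp(fitted(exp_fit)), col = "red")
```

Fitting on the log scale assumes the noise is roughly multiplicative. If the noise is additive instead, a non-linear fit on the original scale may be more appropriate.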