Sometimes you need to know if your data set contains elements that meet some criterion or a particular set of criteria.
For example, a common data cleaning task is to check if you have missing data (NAs) lurking somewhere in a large data set.
Or you may need to check if you have zeroes or negative numbers, or numbers outside a given range.
In such cases, the any() and all() commands are very helpful. You can use them to interrogate R about the values in your data.
Test for the existence of particular values using the any() command
b <- c(7, 2, 4, 3, -1, -2, 3, 3, 6, 8, 12, 7, 3)
b
[1] 7 2 4 3 -1 -2 3 3 6 8 12 7 3
any(b == -4)
[1] FALSE
any(b < 5)
[1] TRUE
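The other checks mentioned in the introduction (zeroes, negative numbers, values outside a range) follow the same pattern. Here is a minimal sketch that reuses the vector b from above. The range limits 0 and 10 are arbitrary, chosen only for illustration, and the variable names are hypothetical.

```r
b <- c(7, 2, 4, 3, -1, -2, 3, 3, 6, 8, 12, 7, 3)

# Are there any zeroes?
any_zero <- any(b == 0)

# Are there any negative numbers?
any_negative <- any(b < 0)

# Are there any values outside the range 0 to 10? (limits chosen for illustration)
any_out_of_range <- any(b < 0 | b > 10)

# any() only says whether a match exists.
# which() gives the positions of the matches and sum() counts them.
neg_positions <- which(b < 0)
neg_count <- sum(b < 0)
```

If any() returns TRUE, which() and sum() are a natural follow-up for finding out where the offending values are and how many there are.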
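Both commands work on logical vectors: a comparison such as b < 5 produces a vector of TRUE and FALSE values, and any() reports whether at least one of them is TRUE. Use any() to check for missing data in a vector or an array: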
d <- c(3, 2, NA, 5, 6, NA)
d
[1] 3 2 NA 5 6 NA
any(is.na(d))
[1] TRUE
Of course, we can check for non-missing data too.
any(!is.na(d))
[1] TRUE
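One point worth knowing when a vector contains NAs: comparing NA with a number gives NA, so any() can itself return NA instead of TRUE or FALSE. The sketch below illustrates this with the vector d, using the na.rm argument of any() and the base R function anyNA(). The thresholds 10 and 5 are chosen only for illustration.

```r
d <- c(3, 2, NA, 5, 6, NA)

# No observed value exceeds 10, but the NAs are unknown, so the answer is NA
r1 <- any(d > 10)

# na.rm = TRUE drops the NAs before testing, so the answer is FALSE
r2 <- any(d > 10, na.rm = TRUE)

# A single TRUE settles the question, even with NAs present
r3 <- any(d > 5)

# anyNA() is a shortcut for any(is.na(d))
r4 <- anyNA(d)
```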
The any() command is helpful when checking for particular values in large data sets.
You can use the all() command to check whether all elements in a given vector or array satisfy a particular condition. For example, let’s see whether all non-missing values in d are less than 5. Note that the command is.na() identifies missing data and that the syntax !is.na() identifies non-missing data.
all(d[!is.na(d)] < 5)
[1] FALSE
Now check whether all non-missing elements are less than 7.
all(d[!is.na(d)] < 7)
[1] TRUE
The syntax above looks formidable. However, is.na() identifies missing elements by creating a logical vector whose elements are either TRUE or FALSE.
is.na(d)
[1] FALSE FALSE TRUE FALSE FALSE TRUE
The syntax !is.na(d) gives the opposite logical vector, which is TRUE for each non-missing element. Then, d[!is.na(d)] gives the elements of d that are non-missing. Finally, we apply the all() command, and include the condition that all elements are less than 7.
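The steps just described can be written out one at a time. This is a sketch only; the intermediate variable names are hypothetical and do not appear in the example above. The last lines show an alternative that is not used in the post: all() also accepts na.rm = TRUE, which drops the missing values without any subsetting.

```r
d <- c(3, 2, NA, 5, 6, NA)

# Step 1: flag the missing elements
missing <- is.na(d)

# Step 2: invert the flags so that TRUE marks non-missing elements
not_missing <- !missing

# Step 3: keep only the non-missing elements of d
observed <- d[not_missing]

# Step 4: test whether every remaining element is less than 7
result <- all(observed < 7)

# Alternative: let all() drop the NAs itself
shortcut <- all(d < 7, na.rm = TRUE)

# Without any NA handling, the NAs make the answer unknown
unhandled <- all(d < 7)
```

Both result and shortcut are TRUE here, while unhandled is NA, which is why the missing values need to be dealt with in one way or the other.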
That wasn’t so hard! In our next blog post we’ll learn about re-coding values in R.
About the Author: David Lillis has taught R to many researchers and statisticians. His company, Sigma Statistics and Research Limited, provides both on-line instruction and face-to-face workshops on R, and coding services in R. David holds a doctorate in applied statistics.
See our full R Tutorial Series and other blog posts regarding R programming.