
Question:
I am just starting out on learning R and came across a piece of code as follows
vec_1 <- c("a","b", NA, "c","d")
# create a subset of all elements which equal "a"
vec_1[vec_1 == "a"]
The result from this is
## [1] "a" NA
I'm just curious: since I am subsetting vec_1 for the value "a", why does NA also show up in my results?
Answer:
This is because the result of anything == NA is NA. Even NA == NA is NA.
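As a minimal sketch of that point (the vector name x is just for illustration), comparing with == never tells you whether something is missing, while the base R function is.na() does:

```r
# Comparing anything with NA yields NA: the missing value is unknown,
# so R cannot say whether the two sides are equal.
na_vs_na  <- NA == NA      # NA
chr_vs_na <- "a" == NA     # NA

# is.na() is the base R way to ask "is this value missing?"
x <- c("a", "b", NA, "c", "d")   # illustrative vector, same shape as vec_1
missing_flags <- is.na(x)        # FALSE FALSE TRUE FALSE FALSE
print(missing_flags)
```

So is.na(x) is what you want whenever the question is "which elements are NA?", rather than x == NA.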
Here's the output of vec_1 == "a":
[1] TRUE FALSE NA FALSE FALSE
and NA is neither TRUE nor FALSE, so when you subset anything by NA you get NA. Check this out:
vec_1[NA]
[1] NA NA NA NA NA
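In case the five NAs look surprising: a bare NA is a logical value, and a logical index is recycled to the length of the vector being subset. A small sketch to illustrate (the variable names are only for this example); an integer NA, by contrast, is treated as a single unknown position:

```r
vec_1 <- c("a", "b", NA, "c", "d")

na_class <- class(NA)                    # "logical"

# Logical NA is recycled to length 5, so every position comes back as NA
n_logical <- length(vec_1[NA])           # 5

# Integer NA is one unknown position, so only one NA comes back
n_integer <- length(vec_1[NA_integer_])  # 1

print(c(n_logical, n_integer))
```

The same thing happens in vec_1[vec_1 == "a"]: the NA in the third position of the logical index produces an NA in the result.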
When dealing with NA, R tries to provide the most informative answer, e.g. T | NA returns TRUE because the result is TRUE whether the missing value is TRUE or FALSE. Here are some more examples:
T | NA
[1] TRUE
F | NA
[1] NA
T & NA
[1] NA
F & NA
[1] FALSE
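The same rule carries over to any() and all(). A rough sketch, with variable names made up for the example (I've written TRUE and FALSE in full here, since T and F are ordinary variables that can be reassigned):

```r
# The answer is known only when the NA cannot change it
any_known   <- any(c(FALSE, NA, TRUE))   # TRUE: one TRUE is enough
all_known   <- all(c(TRUE, NA, FALSE))   # FALSE: one FALSE is enough
any_unknown <- any(c(FALSE, NA))         # NA: depends on the missing value
all_unknown <- all(c(TRUE, NA))          # NA: depends on the missing value

# na.rm = TRUE drops the missing values before evaluating
any_ignoring_na <- any(c(FALSE, NA), na.rm = TRUE)   # FALSE

# isTRUE() is a safe test inside if(): it returns FALSE for NA
na_is_true <- isTRUE(NA)                 # FALSE
```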
The == operator cannot test equality with NA (to check for missing values, use is.na() instead). In your case you can use the %in% operator, which never returns NA:
5 %in% NA
[1] FALSE
"a" %in% NA
[1] FALSE
vec_1[vec_1 %in% "a"]
[1] "a"
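If you'd rather keep ==, here is a sketch of two other common ways to drop the NA, shown next to %in% for comparison (the result names are just illustrative):

```r
vec_1 <- c("a", "b", NA, "c", "d")

# which() returns only the positions where the condition is TRUE,
# so NA (and FALSE) positions are dropped
via_which <- vec_1[which(vec_1 == "a")]

# Explicitly rule out missing values: FALSE & NA is FALSE
via_is_na <- vec_1[!is.na(vec_1) & vec_1 == "a"]

# %in% never returns NA, so the index is TRUE/FALSE only
via_in <- vec_1[vec_1 %in% "a"]

print(list(via_which, via_is_na, via_in))
```

All three give "a" for this vector. Which one to use is mostly a matter of taste; %in% is the shortest, while the is.na() version makes the handling of missing values explicit.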