There’s a tricky bit about R’s logical operators, and though it’s described in the help pages that you get when you type, for example,

?"&"

I still fall for it from time to time. (I think it has to do with my rudimentary knowledge and usage of “&&” for if-statements in bash shell scripts.)

Below I only go through “&&” and “&”. The same applies to “||” and “|”.

Here’s a little demo to remind me and to demonstrate this to newbies:

> 1==1
[1] TRUE
> 1==1 & 1==2
[1] FALSE
> 1==1 && 1==2
[1] FALSE

So far, so good. Now look what happens if we apply this to vectors:

> 1:3==1:3
[1] TRUE TRUE TRUE
> 1:3==c(1,3,3)
[1] TRUE FALSE TRUE
> 1:3==1:3 & 1:3==c(1,3,3)
[1] TRUE FALSE TRUE
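To see exactly what “&” does with each pair of elements, here is a small sketch that runs it over all four TRUE/FALSE combinations. The vectors a and b are made up for illustration; they are not part of the demo above.

```r
# a and b are hypothetical vectors covering all four TRUE/FALSE pairings
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

# "&" works position by position: TRUE only where both a and b are TRUE
both <- a & b
print(both)   # TRUE FALSE FALSE FALSE

# "==" is the operator that tests whether the two elements match
same <- a == b
print(same)   # TRUE FALSE FALSE TRUE
```

Note the last position: FALSE & FALSE is FALSE, even though the two elements are equal. The same rule is applied position by position in the 1:3 example above.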

This is what you probably want in most cases: an element-wise logical AND. The “&” here combines each element of the vector “TRUE TRUE TRUE” with the corresponding element of “TRUE FALSE TRUE” and returns “TRUE” where both elements are “TRUE” and “FALSE” otherwise. Now look what happens if we use “&&”:

> 1:3==1:3 && 1:3==1:3
[1] TRUE
> 1:3==1:3 && 1:3==c(1,3,3)
[1] TRUE

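For “&&”, as the R help page says, “The longer form evaluates left to right examining only the first element of each vector.” (That is how older versions of R behave. Since R 4.3.0, calling “&&” or “||” with an argument longer than one is an error, and R 4.2.x gives a warning, so on a current installation the “&&” calls on vectors in this post stop with an error instead of returning a value.)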

This is quite obvious in this example, but it can get confusing when you use a logical operator to index a vector. Watch this:

> x <- 1:5
> x[x<5]
[1] 1 2 3 4
> x[x<5 & x>2]
[1] 3 4

Fine.

> x[x<5 && x>2]
integer(0)

The reason:

> x>2 & x<5
[1] FALSE FALSE TRUE TRUE FALSE
> x>2 && x<5
[1] FALSE

So in the first case, R "sees"

> x[c(FALSE,FALSE,TRUE,TRUE,FALSE)]

and in the second case, it sees:

> x[FALSE]
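Running those two index expressions by hand confirms the results. This is just a small sketch; the names idx_vec, picked and nothing are mine and only there for illustration.

```r
x <- 1:5

# What "&" hands to the index: one TRUE/FALSE per element of x
idx_vec <- x > 2 & x < 5
picked <- x[idx_vec]
print(picked)    # 3 4

# What the "&&" version boiled down to: a single FALSE
nothing <- x[FALSE]
print(nothing)   # integer(0)
```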

As long as the first element evaluates to "FALSE", you're actually lucky, because the mistake will be obvious. It's trickier when "&&" evaluates to "TRUE" on the first element, because R recycles a logical index that is shorter than the vector being indexed, and this can lead to things like:

> x <- 1:10
> y <- c(1,3:11)
> x==y
[1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> x[x==1 & y==1]
[1] 1
> x[x==1 && y==1]
[1] 1 2 3 4 5 6 7 8 9 10
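The recycling behind that last result can be shown in isolation by feeding short logical indexes in by hand. A minimal sketch, with variable names (all_kept, every_other) that I made up for the example:

```r
x <- 1:10

# A single TRUE is recycled to the length of x, so every element is kept
all_kept <- x[TRUE]
print(all_kept)      # 1 2 3 4 5 6 7 8 9 10

# A length-two index is recycled as well: keep, drop, keep, drop, ...
every_other <- x[c(TRUE, FALSE)]
print(every_other)   # 1 3 5 7 9
```

So a lone TRUE as an index silently selects everything, which is why this mistake is easy to miss.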

So basically, for logical comparison, stick to "&" unless you know you need "&&".
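For what it's worth, the typical place where "&&" is the right tool is a single yes/no condition in an if-statement, where each side is one value and you want evaluation to stop as soon as the result is known. Here is a sketch of that, assuming a made-up helper called safe_positive(); it also shows all() and any(), which are the explicit way to collapse a whole vector comparison into one value.

```r
# safe_positive() is a hypothetical helper, made up for this example
safe_positive <- function(v) {
  # "&&" stops at the first FALSE, so v > 0 is never evaluated for NULL input
  if (!is.null(v) && length(v) == 1 && v > 0) {
    "positive"
  } else {
    "not positive"
  }
}

print(safe_positive(3))      # "positive"
print(safe_positive(NULL))   # "not positive"

# To boil a vector comparison down to one value, say so explicitly
x <- 1:3
y <- c(1, 3, 3)
print(all(x == y))   # FALSE
print(any(x == y))   # TRUE
```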

By the way, if you want to see the values that are elements of both x and y, use "%in%":

> x[x%in%y]
[1] 1 3 4 5 6 7 8 9 10
> y[y%in%x]
[1] 1 3 4 5 6 7 8 9 10

But don't do:

> y[x%in%y]
[1] 1 4 5 6 7 8 9 10 11

This indexes y using the logical vector returned by "x%in%y", which is of course:

> x%in%y
[1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
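The FALSE sits in position 2 because x[2], which is 2, is missing from y. Used as an index into y, it drops y[2], which is 3, and that is a value the two vectors do share. A short sketch of the safe pattern, plus base R's intersect() as a one-step alternative (the names common_x, common_y and shared are mine):

```r
x <- 1:10
y <- c(1, 3:11)

# Build the logical index from the same vector that gets indexed
common_x <- x[x %in% y]
common_y <- y[y %in% x]
print(common_x)   # 1 3 4 5 6 7 8 9 10

# intersect() from base R gives the shared values in one step
shared <- intersect(x, y)
print(shared)     # 1 3 4 5 6 7 8 9 10
```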