Puzzle
I am not sure how to condition on whether a value in column A appears in column B.
I use the naive way
value A %in% column B
within a mutate function. It works in that R searches all values in column B. What’s more, it even works within groups. However, I am not sure whether this is luck or the right way to do it.
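To make the behaviour concrete, here is a minimal base R sketch of what `%in%` does compared with `==`. The vector names `in_b`, `in_b_match` and `same_row` are only illustrative. It relies on the documented equivalence between `x %in% table` and `match(x, table, nomatch = 0) > 0`.

```r
a <- c(1, 2, 3, 4, 5)
b <- c(3, 2, 4, 5, 6)

# %in% tests each element of `a` against every element of `b`
in_b <- a %in% b

# Equivalent formulation via match(): a position > 0 means "found somewhere in b"
in_b_match <- match(a, b, nomatch = 0) > 0

# == compares element by element, i.e. row by row inside a data frame
same_row <- a == b

print(data.frame(a, b, in_b, in_b_match, same_row))
```

So `%in%` searching the whole of column B is not luck: it is vectorised over its left-hand side and always looks up each value in the full right-hand vector.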
Example
library(dplyr, warn.conflicts = FALSE)
Data
df <- data.frame(a = c(1,2,3,4,5), b = c(3,2,4,5,6), group = c("a","a", "b", "b", "c"))
print(df)
## a b group
## 1 1 3 a
## 2 2 2 a
## 3 3 4 b
## 4 4 5 b
## 5 5 6 c
Without groups
df %>% mutate(c = as.integer(a %in% b), d = as.integer(a == b))
## a b group c d
## 1 1 3 a 0 0
## 2 2 2 a 1 1
## 3 3 4 b 1 0
## 4 4 5 b 1 0
## 5 5 6 c 1 0
By groups
df %>%
  group_by(group) %>%
  mutate(c = as.integer(a %in% b), d = as.integer(a == b))
## # A tibble: 5 x 5
## # Groups: group [3]
## a b group c d
## <dbl> <dbl> <fct> <int> <int>
## 1 1.00 3.00 a 0 0
## 2 2.00 2.00 a 1 1
## 3 3.00 4.00 b 0 0
## 4 4.00 5.00 b 1 0
## 5 5.00 6.00 c 0 0
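The grouped result differs from the ungrouped one because, inside a grouped `mutate()`, a bare column name such as `b` refers only to the rows of the current group. The sketch below contrasts that with a lookup against the full column. The column names `in_group_b` and `in_any_b` are made up for illustration, and referring to `df$b` assumes the original data frame is still available under the name `df`.

```r
library(dplyr, warn.conflicts = FALSE)

df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(3, 2, 4, 5, 6),
  group = c("a", "a", "b", "b", "c")
)

res <- df %>%
  group_by(group) %>%
  mutate(
    in_group_b = a %in% b,     # `b` is only the current group's slice
    in_any_b   = a %in% df$b   # `df$b` is the full, ungrouped column
  ) %>%
  ungroup()

print(res)
```

If the whole-column check is what you want, calling `ungroup()` before `mutate()` is another way to get it without referring back to `df`.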