# dplyr how to lag by group

I have a data frame of orders and receivables with lead times. Can I use dplyr to fill in the receive column according to each group's lead time?

```
df <- data.frame(team = c("a","a","a","a", "a", "b", "b", "b", "b", "b"),
                 order = c(2, 4, 3, 5, 6, 7, 8, 5, 4, 5),
                 lead_time = c(3, 3, 3, 3, 3, 2, 2, 2, 2, 2))

> df
  team order lead_time
     a     2         3
     a     4         3
     a     3         3
     a     5         3
     a     6         3
     b     7         2
     b     8         2
     b     5         2
     b     4         2
     b     5         2
```

```
dfb <- data.frame(team = c("a","a","a","a", "a", "b", "b", "b", "b", "b"),
                  order = c(2, 4, 3, 5, 6, 7, 8, 5, 4, 5),
                  lead_time = c(3, 3, 3, 3, 3, 2, 2, 2, 2, 2),
                  receive = c(0, 0, 0, 2, 4, 0, 0, 7, 8, 5))

> dfb
  team order lead_time receive
     a     2         3       0
     a     4         3       0
     a     3         3       0
     a     5         3       2
     a     6         3       4
     b     7         2       0
     b     8         2       0
     b     5         2       7
     b     4         2       8
     b     5         2       5
```

I was thinking along these lines, but I run into an error:

```
dfc <- df %>%
  group_by(team) %>%
  mutate(receive = if_else(row_number() < lead_time, 0, lag(order, n = lead_time)))

Error in mutate_impl(.data, dots) :
  could not convert second argument to an integer. type=SYMSXP, length = 1
```

Thanks for the help!

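This looks like a bug. There may be an unintended masking of the `lag` function between the `dplyr` and `stats` packages. Try this workaround, which calls `dplyr::lag` explicitly and passes a single lag value per group via `unique(lead_time)`: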
```
library(dplyr)

df %>%
  group_by(team) %>%
  # explicitly specify the source of the lag function here
  mutate(receive = dplyr::lag(order, n = unique(lead_time), default = 0))

#Source: local data frame [10 x 4]
#Groups: team [2]
#     team order lead_time receive
#   <fctr> <dbl>     <dbl>   <dbl>
#1       a     2         3       0
#2       a     4         3       0
#3       a     3         3       0
#4       a     5         3       2
#5       a     6         3       4
#6       b     7         2       0
#7       b     8         2       0
#8       b     5         2       7
#9       b     4         2       8
#10      b     5         2       5
```
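To make the mechanics of that workaround explicit: `dplyr::lag()` expects `n` to be a single number, so the per-row `lead_time` column has to be reduced to one value per group before shifting. Below is a minimal base-R sketch of the same idea, assuming `lead_time` is constant within each team. The helper name `lag_fill` is hypothetical and only for illustration; it is not part of dplyr or base R.

```r
# Hypothetical helper: shift x down by n positions, padding the front with `fill`.
lag_fill <- function(x, n, fill = 0) {
  n <- min(n, length(x))
  c(rep(fill, n), head(x, length(x) - n))
}

df <- data.frame(team = c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b"),
                 order = c(2, 4, 3, 5, 6, 7, 8, 5, 4, 5),
                 lead_time = c(3, 3, 3, 3, 3, 2, 2, 2, 2, 2))

# Row indices for each team, so results are written back to the right rows
# even if the teams are not stored contiguously.
rows_by_team <- split(seq_len(nrow(df)), df$team)

df$receive <- numeric(nrow(df))
for (rows in rows_by_team) {
  # Assumption: lead_time is the same for every row of a team,
  # so the first value stands in for the whole group.
  df$receive[rows] <- lag_fill(df$order[rows], df$lead_time[rows][1])
}

df
```

If `lead_time` could vary within a team, taking the first value would silently ignore the other values, so this sketch should not be used as-is in that case.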
We can also use `shift` from `data.table`:
```
library(data.table)
setDT(df)[, receive := shift(order, n = lead_time[1], fill = 0), by = team]
df
#    team order lead_time receive
# 1:    a     2         3       0
# 2:    a     4         3       0
# 3:    a     3         3       0
# 4:    a     5         3       2
# 5:    a     6         3       4
# 6:    b     7         2       0
# 7:    b     8         2       0
# 8:    b     5         2       7
# 9:    b     4         2       8
#10:    b     5         2       5
```