89547 Finding Cumulative Sum In R Using Conditions

I need to create a new variable with the sum of the past three years' amounts for each ID.

If there are not three years' worth of data, there should be an 'NA'.

As an example:

ID YEAR AMOUNT 1 2010 5 1 2011 2 1 2012 4 1 2013 1 1 2014 3 2 2013 4 2 2014 6 2 2015 9 3 2012 4 3 2013 7 3 2014 2 3 2015 3

Here's what the result should be:

ID YEAR AMOUNT THREE_YR 1 2010 5 NA 1 2011 2 NA 1 2012 4 11 1 2013 1 7 1 2014 3 8 2 2013 4 NA 2 2014 6 NA 2 2015 9 19 3 2012 4 NA 3 2013 7 NA 3 2014 2 13 3 2015 3 12

How would I do this? Thanks!

We can use functions from dplyr and zoo. dt2 is the final output.

# Create example data frame dt <- read.table(text = "ID YEAR AMOUNT 1 2010 5 1 2011 2 1 2012 4 1 2013 1 1 2014 3 2 2013 4 2 2014 6 2 2015 9 3 2012 4 3 2013 7 3 2014 2 3 2015 3", header = TRUE, stringsAsFactors = FALSE) # Load packages library(dplyr) library(zoo) # Process the data dt2 <- dt %>% group_by(ID) %>% mutate(THREE_YR = rollsum(AMOUNT, k = 3, fill = NA, align = "right"))

Update: ID groups with less than 3 records.

The OP asked what to do if there are IDs with only one or two rows. Honestly, I did not find a good way to solve this. The only thing I can think of is dividing the original data frame to two groups, apply the rollsum to the group with all records larger than or equal to three. After that, combine all groups.

# Create example data frame dt <- read.table(text = "ID YEAR AMOUNT 1 2010 5 1 2011 2 1 2012 4 1 2013 1 1 2014 3 2 2013 4 3 2012 4 3 2013 7 3 2014 2 3 2015 3", header = TRUE, stringsAsFactors = FALSE) # Load packages library(dplyr) library(zoo) # Process the data dt2 <- dt %>% group_by(ID) %>% filter(n() >= 3) %>% mutate(THREE_YR = rollsum(AMOUNT, k = 3, fill = NA, align = "right")) %>% bind_rows(dt %>% group_by(ID) %>% filter(n() < 3)) %>% arrange(ID, YEAR)