
R: Function “diff” over various groups

Question:

While searching for a solution to my problem I found this thread: Function "diff" over various groups in R (https://stackoverflow.com/questions/8254508/function-diff-over-various-groups-in-r). I have a very similar question, so I'll just work with the example there.

This is what my desired output should look like:

      name class year diff
    1    a    c1 2009   NA
    2    a    c1 2010   67
    3    b    c1 2009   NA
    4    b    c1 2010   20

I have two variables which form subgroups: class and name. So I want to compare only the values which have the same name and class. I also want the differences from 2009 to 2010. If there is no 2008 row, the diff for 2009 should be NA (since no difference can be calculated).

I'm sure it works very similarly to the other thread, but I just can't make it work. I used the code from that thread (and handled the ascending year by sorting the data differently), but somehow R still calculates a difference for every row it returns and never gives NA:

    ddply(df, .(class, name), summarize,
          year = head(year, -1), value = diff(value))

Answer1:

Using dplyr

    library(dplyr)

    df %>%
      filter(year != 2008) %>%
      arrange(name, class, year) %>%
      group_by(class, name) %>%
      mutate(diff = c(NA, diff(value)))

    # Source: local data frame [12 x 5]
    # Groups: class, name
    #    name class year value diff
    # 1     a    c1 2009    33   NA
    # 2     a    c1 2010   100   67
    # 3     a    c2 2009    80   NA
    # 4     a    c2 2010    90   10
    # 5     a    c3 2009    90   NA
    # 6     a    c3 2010   100   10
    # 7     b    c1 2009    80   NA
    # 8     b    c1 2010    90   10
    # 9     b    c2 2009    90   NA
    # 10    b    c2 2010   100   10
    # 11    b    c3 2009    80   NA
    # 12    b    c3 2010    99   19

Update:

With relative difference:

    df %>%
      filter(year != 2008) %>%
      arrange(name, class, year) %>%
      group_by(class, name) %>%
      mutate(diff1 = c(NA, diff(value)),
             # lag(value) is the previous value within the group (NA for the
             # first row); it also works for groups with more than two rows
             rel_diff = round(diff1 / lag(value), 2))
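
The relative difference divides each difference by the previous value in the same group. As a minimal base R sketch of that arithmetic for a single group (the three values below are hypothetical and not from the original data), the previous value can be built by shifting the vector by one position. Note that indexing with a position vector that starts at 0 silently drops that element, so the result is one shorter than the input rather than padded with NA.

```r
# Hypothetical values for one group with three years (not from the original data)
value <- c(50, 33, 100)

# Previous value for each row; the first row has no predecessor
prev <- c(NA, head(value, -1))

# Index 0 is dropped silently, so this shifted index yields only 2 elements
shifted <- value[seq_along(value) - 1]
length(shifted)

diff1    <- c(NA, diff(value))      # absolute change from the previous row
rel_diff <- round(diff1 / prev, 2)  # change relative to the previous value
rel_diff
```

Padding with NA keeps every vector the same length as the group, which is what a grouped mutate expects.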

Answer2:

Using the data set from the other post, I would do something like:

    library(data.table)

    df <- df[df$year != 2008, ]
    setkey(setDT(df), class, name, year)
    df[, diff := lapply(.SD, function(x) c(NA, diff(x))),
       .SDcols = "value", by = list(class, name)]

Which returns

    df
    #     name class year value diff
    #  1:    a    c1 2009    33   NA
    #  2:    a    c1 2010   100   67
    #  3:    b    c1 2009    80   NA
    #  4:    b    c1 2010    90   10
    #  5:    a    c2 2009    80   NA
    #  6:    a    c2 2010    90   10
    #  7:    b    c2 2009    90   NA
    #  8:    b    c2 2010   100   10
    #  9:    a    c3 2009    90   NA
    # 10:    a    c3 2010   100   10
    # 11:    b    c3 2009    80   NA
    # 12:    b    c3 2010    99   19
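
Both answers rely on the same step: diff() returns one element fewer than its input, so an NA is prepended within each (class, name) group to keep one result per row. For comparison, here is a minimal base R sketch of that idea using ave(), with no extra packages. The data frame is an illustrative subset of the rows shown in the output above (class c1 only), entered unsorted on purpose; this is one possible variant, not the method either answer uses.

```r
# Illustrative subset of the rows shown in the answers' output (class c1 only),
# deliberately unsorted to show why the ordering step matters
df <- data.frame(
  name  = c("a", "b", "a", "b"),
  class = c("c1", "c1", "c1", "c1"),
  year  = c(2010, 2009, 2009, 2010),
  value = c(100, 80, 33, 90)
)

# Years must ascend within each (class, name) group before differencing
df <- df[order(df$class, df$name, df$year), ]

# diff() returns one element fewer than its input, so pad the front with NA
df$diff <- ave(df$value, df$class, df$name,
               FUN = function(x) c(NA, diff(x)))

df
```

On this subset the diff column comes out as NA, 67, NA, 10, which matches the class c1 rows in the output above.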
