
subtracting two data frames

Suppose I have two data frames (DF1 and DF2), both containing (x, y) coordinates. I would like to extract the (x, y) pairs that are in DF1 but not in DF2. Example:

    DF1 <- data.frame(x = 1:3, y = 4:6, t = 10:12)
    DF2 <- data.frame(x = 3:5, y = 6:8, s = 1:3)

I want to get

    DF_new <- data.frame(x = 1:2, y = 4:5, t = 10:11)

What should I do for much larger data sets? Thanks!

Answer1:

For very large data sets you may be interested in data.table:

    library(data.table)

    DF1 <- data.frame(x = 1:3, y = 4:6, t = 10:12)
    DF2 <- data.frame(x = 3:5, y = 6:8, s = 1:3)

    # Key both tables on the shared coordinate columns
    DF1 <- data.table(DF1, key = c("x", "y"))
    DF2 <- data.table(DF2, key = c("x", "y"))

    DF1[complete.cases(DF1[DF2])]
    # maybe you want this?
    DF2[DF1]

    # not-join: rows of DF1 whose (x, y) key does not appear in DF2
    DF1[!DF2]
    # or maybe you want this?
    DF2[!DF1]
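If you would rather not set keys on the tables, the same not-join can be written with the `on` argument. This is only a sketch on the example data from the question; the names DT1, DT2 and DT_new are made up for illustration, and it assumes a data.table version that supports `on` (1.9.6 or later).

```r
library(data.table)

DF1 <- data.frame(x = 1:3, y = 4:6, t = 10:12)
DF2 <- data.frame(x = 3:5, y = 6:8, s = 1:3)

# Convert to data.table without setting keys (DT1/DT2 are illustrative names)
DT1 <- as.data.table(DF1)
DT2 <- as.data.table(DF2)

# Not-join on the shared coordinate columns:
# keep the rows of DT1 whose (x, y) pair has no match in DT2
DT_new <- DT1[!DT2, on = c("x", "y")]
```

On the example data this should leave the two rows with x equal to 1 and 2, which matches the DF_new asked for in the question.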

Answer2:

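merge looks like a good candidate here. Note that by default it joins on the shared columns (x and y) and keeps only the rows present in both data frames, so on its own it gives the intersection rather than the difference:

    merge(DF1, DF2)
      x y  t s
    1 3 6 12 1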
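To get the rows of DF1 that are not in DF2 with base R only, one option is a left merge against a flagged copy of DF2's coordinates, keeping the rows where the flag is missing. A minimal sketch, assuming the example data from the question; DF2_keys, in_DF2 and merged are hypothetical names introduced here, not part of the original answer.

```r
DF1 <- data.frame(x = 1:3, y = 4:6, t = 10:12)
DF2 <- data.frame(x = 3:5, y = 6:8, s = 1:3)

# Distinct (x, y) pairs of DF2, tagged with a marker column
DF2_keys <- unique(DF2[, c("x", "y")])
DF2_keys$in_DF2 <- TRUE

# Left join: every row of DF1 is kept; in_DF2 is NA where DF2 has no match
merged <- merge(DF1, DF2_keys, by = c("x", "y"), all.x = TRUE)

# Keep only the unmatched rows and drop the marker column
DF_new <- merged[is.na(merged$in_DF2), names(DF1)]
```

Be aware that merge sorts the result by the join columns, so the row order of DF_new may differ from DF1 on other inputs.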

Answer3:

    library(tidyverse)

    DF1 <- data.frame(x = 1:3, y = 4:6, t = 10:12)
    DF2 <- data.frame(x = 3:5, y = 6:8, s = 1:3)

    anti_join(DF1, DF2)
    #> Joining, by = c("x", "y")
    #>   x y  t
    #> 1 1 4 10
    #> 2 2 5 11
