Program to obtain frequency matrix of categorical data


I am working with data that contains more than 300 categorical features, which I have encoded as 0s and 1s. Now I need to create a feature-by-feature matrix in which each cell holds the frequency of joint occurrence of the two features.

In the end, I want to create a heatmap of this frequency matrix.

So, my dataframe in R looks like this:

     id  cat1  cat2  cat3  cat4
    156     0     0     1     1
    465     1     1     1     0
    573     0     1     1     0

The output I want is:

         cat1 cat2 cat3 ...
    cat1    0    1    1
    cat2    1    0    2
    cat3    1    2    0
    .
    .

where each cell value denotes the number of times the two categorical variables have appeared together.
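To make the cell definition concrete, here is a minimal sketch that counts the joint occurrences for a single pair of columns. The data frame name `df1` is an assumption; the values are the three sample rows shown above.

```r
# Sample rows from the table above; the name `df1` is an assumption
df1 <- data.frame(
  id   = c(156L, 465L, 573L),
  cat1 = c(0L, 1L, 0L),
  cat2 = c(0L, 1L, 1L),
  cat3 = c(1L, 1L, 1L),
  cat4 = c(1L, 0L, 0L)
)

# Number of rows in which cat2 and cat3 are both 1 (ids 465 and 573)
cat2_cat3 <- sum(df1$cat2 == 1 & df1$cat3 == 1)
cat2_cat3
# [1] 2
```

The full matrix repeats this count for every pair of feature columns.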


We can use outer (df below is the data frame from the question, with id as its first column):

    # Since we have only 0's and 1's in the columns we can directly use &
    fun <- function(x, y) sum(df[, x] & df[, y])

    # Get all the cat columns
    n <- seq_along(df)[-1]

    # Apply function to every combination of columns
    mat <- outer(n, n, Vectorize(fun))

    # Turn diagonals to 0
    diag(mat) <- 0

    # Assign rownames and column names
    dimnames(mat) <- list(names(df)[n], names(df)[n])
    mat
    #      cat1 cat2 cat3 cat4
    # cat1    0    1    1    0
    # cat2    1    0    2    0
    # cat3    1    2    0    1
    # cat4    0    0    1    0
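Because the columns hold only 0s and 1s, `sum(x & y)` for a pair of columns equals their dot product, so the same counts can also be obtained with one matrix product instead of a call per pair. The following is an alternative sketch, not part of the answer above; it assumes the data frame is named `df1` and that `id` is its first column.

```r
# Same sample data as in the question (the name `df1` is an assumption)
df1 <- data.frame(
  id   = c(156L, 465L, 573L),
  cat1 = c(0L, 1L, 0L),
  cat2 = c(0L, 1L, 1L),
  cat3 = c(1L, 1L, 1L),
  cat4 = c(1L, 0L, 0L)
)

# 0/1 feature columns as a matrix, dropping the id column
m <- as.matrix(df1[-1])

# crossprod(m) is t(m) %*% m: entry (i, j) counts the rows where
# columns i and j are both 1
co <- crossprod(m)

# The diagonal holds each column's own count, so set it to 0
diag(co) <- 0
co
#      cat1 cat2 cat3 cat4
# cat1    0    1    1    0
# cat2    1    0    2    0
# cat3    1    2    0    1
# cat4    0    0    1    0
```

With several hundred features this avoids evaluating an R function for every pair of columns, which may matter for run time.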

We can use table with crossprod from base R:

    i1 <- as.logical(unlist(df1[-1]))
    out <- crossprod(table(df1$id[row(df1[-1])][i1],
                           names(df1)[-1][col(df1[-1])][i1]))
    diag(out) <- 0
    out
    #      cat1 cat2 cat3 cat4
    # cat1    0    1    1    0
    # cat2    1    0    2    0
    # cat3    1    2    0    1
    # cat4    0    0    1    0

Data:

    df1 <- structure(list(id = c(156L, 465L, 573L),
                          cat1 = c(0L, 1L, 0L),
                          cat2 = c(0L, 1L, 1L),
                          cat3 = c(1L, 1L, 1L),
                          cat4 = c(1L, 0L, 0L)),
                     class = "data.frame", row.names = c(NA, -3L))
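Since the stated end goal is a heatmap, the resulting matrix can be passed straight to a plotting function. Below is a minimal sketch using base R's `heatmap`. It rebuilds the counts with `crossprod` on the 0/1 columns for brevity, which is an assumption of this sketch rather than the exact code of either answer, and the plotting arguments are one possible choice.

```r
# Sample data from the question
df1 <- structure(list(id = c(156L, 465L, 573L),
                      cat1 = c(0L, 1L, 0L),
                      cat2 = c(0L, 1L, 1L),
                      cat3 = c(1L, 1L, 1L),
                      cat4 = c(1L, 0L, 0L)),
                 class = "data.frame", row.names = c(NA, -3L))

# Co-occurrence counts with the diagonal set to 0
out <- crossprod(as.matrix(df1[-1]))
diag(out) <- 0

# Rowv = NA and Colv = NA skip the dendrograms and keep the column order;
# scale = "none" plots the raw counts instead of row-scaled values
heatmap(out, Rowv = NA, Colv = NA, scale = "none", symm = TRUE)
```

Leaving `Rowv` and `Colv` at their defaults would instead cluster the features and reorder them, which can be useful for spotting groups of features that tend to occur together.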


