
Word cloud in R with two separate values

Question:

I am new to R and am trying to produce a word cloud that encodes two variables: frequency and rating. Using a generic table, I want font size to show the hypothetical number of colleges in each state (more colleges = larger font) and color to show the hypothetical average college rating:

  • 1 = green (good)
  • 3 = yellow (average)
  • 5 = red (bad)

I can create a cloud in which font size reflects the number of colleges, but I cannot tie the color to the rating in the third column. Here is my generic table:

    State          Colleges  Rating
    Alabama        220       1
    Alaska         100       3
    Arizona        50        5
    Arkansas       275       1
    California     155       3
    Colorado       68        5
    Connecticut    235       1
    Delaware       189       3
    Florida        32        5
    Georgia        219       1
    Hawaii         117       3
    Idaho          63        5
    Illinois       264       1
    Indiana        167       3
    Iowa           76        5
    Kansas         287       1
    Kentucky       178       3
    Louisiana      67        5
    Maine          246       1
    Maryland       169       3
    Massachusetts  46        5
    Michigan       225       1
    Minnesota      132       3
    Mississippi    23        5
    Missouri       219       1
    Montana        194       3
    Nebraska       97        5

Below is my very simple script:

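    library(wordcloud)
    library(RColorBrewer)  # package names are case-sensitive

    # Read the table of states, college counts, and ratings
    data <- read.csv("wordcloud.csv", header = TRUE)

    # Diverging red-yellow-green palette with 9 steps
    pal <- brewer.pal(9, "RdYlGn")

    # Font size follows the number of colleges
    wordcloud(data$State, data$Colleges, scale = c(4, 1), colors = pal, rot.per = .5)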

The above script allows for text size to reflect number of colleges, but I am not able to link the color ramp of 1 = green (good) to 3 = yellow (average) to 5 = red (bad). Any suggestions are greatly appreciated.

Answer1:

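You can assign the colours manually and add ordered.colors = TRUE. With that option the i-th colour is applied to the i-th word, so the colors vector needs one entry per word. The rep() call below works only because the ratings in the table cycle 1, 3, 5 row by row:

    wordcloud(data$State, data$Colleges, scale = c(4, 1),
              colors = rep(c("green", "yellow", "red"), 9),
              rot.per = .5, ordered.colors = TRUE)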

Resulting word cloud: https://i.stack.imgur.com/Kchce.png
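
If the ratings do not repeat in a fixed order, the colour vector can be derived from the Rating column instead of being built with rep(). The following is a minimal sketch, assuming the three rating values 1, 3 and 5 from the question; the names `dat`, `rating_colors` and `word_colors` are made up for illustration, and only the first six rows of the table are rebuilt inline to keep it self-contained.

```r
# Subset of the question's table (first six rows), rebuilt inline so the
# sketch is self-contained.
dat <- data.frame(
  State    = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  Colleges = c(220, 100, 50, 275, 155, 68),
  Rating   = c(1, 3, 5, 1, 3, 5),
  stringsAsFactors = FALSE
)

# Lookup table (hypothetical name): rating value -> colour.
# The names are the rating values as strings.
rating_colors <- c("1" = "green", "3" = "yellow", "5" = "red")

# One colour per row, in the same order as dat$State.
word_colors <- unname(rating_colors[as.character(dat$Rating)])

# Draw the cloud only if the wordcloud package is installed.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(dat$State, dat$Colleges, scale = c(4, 1),
                       colors = word_colors, ordered.colors = TRUE,
                       rot.per = 0.5)
}
```

Because the colours come from a lookup on Rating, the rows can be in any order and the vector always has the same length as the list of words, which is what ordered.colors = TRUE requires.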

Answer2:

There's also the possibility to plot a comparison cloud in such cases.

For this, we first convert the data from long to wide format:

    library(reshape2)
    df1 <- dcast(df1, State + Colleges ~ Rating, value.var = "Colleges")

Then we perform a few standard operations to prepare a suitable matrix:

    rownames(df1) <- df1[, 1]   # use the state names as row names
    df1 <- df1[, -c(1, 2)]      # remove the "State" and "Colleges" columns
    df1[is.na(df1)] <- 0        # set NA values to zero
    df1 <- as.matrix(df1)       # convert into a matrix
    colnames(df1) <- c("good", "average", "bad")
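
For readers who prefer to avoid the reshape2 dependency, a matrix of the same shape can probably be built in base R with tapply(). This is a sketch under that assumption, not the answerer's method; `dat` and `m` are illustrative names, and only the first six rows of the table are used.

```r
# Subset of the question's table (first six rows), rebuilt inline so the
# sketch is self-contained.
dat <- data.frame(
  State    = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  Colleges = c(220, 100, 50, 275, 155, 68),
  Rating   = c(1, 3, 5, 1, 3, 5),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per state, one column per rating value.
# State/rating combinations with no data come back as NA.
m <- tapply(dat$Colleges, list(dat$State, dat$Rating), sum)

# Replace the empty cells with zero, as in the reshape2 version.
m[is.na(m)] <- 0

# Columns are ordered by rating value (1, 3, 5).
colnames(m) <- c("good", "average", "bad")

print(m)
```

Each state ends up with its college count in exactly one column and zeros elsewhere, so `m` should be usable in place of `df1` in the comparison.cloud() call.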

Finally, we can plot the comparison cloud and assign colors to the groups as we wish:

    library(wordcloud)
    comparison.cloud(df1, max.words = Inf, random.order = FALSE,
                     scale = c(4, .5), title.size = 1,
                     colors = c("green", "orange", "red"))

Resulting comparison cloud: https://i.stack.imgur.com/P7rcR.png

Data:

    df1 <- structure(list(
      State = structure(1:27, .Label = c("Alabama", "Alaska", "Arizona",
        "Arkansas", "California", "Colorado", "Connecticut", "Delaware",
        "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana",
        "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland",
        "Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri",
        "Montana", "Nebraska"), class = "factor"),
      Colleges = c(220L, 100L, 50L, 275L, 155L, 68L, 235L, 189L, 32L,
        219L, 117L, 63L, 264L, 167L, 76L, 287L, 178L, 67L, 246L, 169L,
        46L, 225L, 132L, 23L, 219L, 194L, 97L),
      Rating = c(1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L,
        5L, 1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L, 5L)),
      .Names = c("State", "Colleges", "Rating"),
      class = "data.frame", row.names = c(NA, -27L))
