Stacked barplot using R base: how to add values inside each stacked bar


I found a plot in a stats book, which I want to reproduce with the base package.

The plot looks like this:

<a href="https://i.stack.imgur.com/etIoq.png" rel="nofollow"><img alt="enter image description here" class="b-lazy" data-src="https://i.stack.imgur.com/etIoq.png" data-original="https://i.stack.imgur.com/etIoq.png" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>

So far I have the plot, but I am having trouble adding a centred label to each part of the bar.

My code looks like this:

    data <- sample(5, 10, replace = TRUE)
    colors <- c('yellow', 'violet', 'green', 'pink', 'red')
    relative.frequencies <- as.matrix(prop.table(table(data)))
    bc <- barplot(relative.frequencies, horiz = TRUE, axes = FALSE, col = colors)


For your given example, we can do the following (<strong>other readers can skip this part and jump to the next</strong>):

    set.seed(0)  ## `set.seed` for reproducibility
    dat <- sample(5, 10, replace = TRUE)
    colors <- c('yellow', 'violet', 'green', 'pink')
    h <- as.matrix(prop.table(table(dat)))

    ## compute x-location of the centre of each bar
    H <- apply(h, 2L, cumsum) - h / 2

    ## add text to barplot
    bc <- barplot(h, horiz = TRUE, axes = FALSE, col = colors)
    text(H, bc, labels = paste0(100 * h, "%"))

<a href="https://i.stack.imgur.com/CwbiW.jpg" rel="nofollow"><img alt="strip" class="b-lazy" data-src="https://i.stack.imgur.com/CwbiW.jpg" data-original="https://i.stack.imgur.com/CwbiW.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>

<hr />

For all readers

<strong>I will now construct a comprehensive example for you to digest the idea.</strong>

<strong>Step 1: generate a toy matrix of percentages for the experiment</strong>

    ## a function to generate an `n * p` matrix `h`, with `h > 0` and `colSums(h) = 1`
    sim <- function (n, p) {
      set.seed(0)
      ## a positive random matrix of `n` rows and `p` columns
      h <- matrix(runif(n * p), nrow = n)
      ## rescale columns of `h` so that `colSums(h)` is 1
      h <- h / rep(colSums(h), each = n)
      ## for neatness we round `h` to 2 decimals
      h <- round(h, 2L)
      ## but then `colSums(h)` may no longer be exactly 1;
      ## no worry, we simply reset the last row
      ## (`drop = FALSE` keeps a matrix even when `n = 2`)
      h[n, ] <- 1 - colSums(h[-n, , drop = FALSE])
      ## now return this good toy matrix
      h
    }

    h <- sim(4, 3)
    #     [,1] [,2] [,3]
    #[1,] 0.43 0.31 0.42
    #[2,] 0.13 0.07 0.40
    #[3,] 0.18 0.30 0.04
    #[4,] 0.26 0.32 0.14

<strong>Step 2: understand a stacked bar-chart and get "mid-height" of each stacked bar</strong>

In a stacked bar-chart, the height at which each stacked segment ends is the cumulative sum down the corresponding column of h:

    H <- apply(h, 2L, cumsum)
    #     [,1] [,2] [,3]
    #[1,] 0.43 0.31 0.42
    #[2,] 0.56 0.38 0.82
    #[3,] 0.74 0.68 0.86
    #[4,] 1.00 1.00 1.00

We now shift back by h / 2 to get the centre of each stacked segment:

    H <- H - h / 2
    #      [,1]  [,2] [,3]
    #[1,] 0.215 0.155 0.21
    #[2,] 0.495 0.345 0.62
    #[3,] 0.650 0.530 0.84
    #[4,] 0.870 0.840 0.93
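As a sanity check on the arithmetic above, here is a minimal sketch. The matrix is retyped by hand from the Step 1 output so the snippet runs on its own, and the names `top`, `bottom` and `mid` are my own, not part of the answer. It only confirms that each centre sits halfway between the lower and upper edge of its segment.

```r
## toy matrix from Step 1, typed in by hand (columns are the three bars)
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)

top    <- apply(h, 2L, cumsum)  ## upper edge of each segment
bottom <- top - h               ## lower edge of each segment
mid    <- top - h / 2           ## centre, computed as in Step 2

## the centre should be the average of the two edges
all.equal(mid, (top + bottom) / 2)
```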

<strong>Step 3: produce a bar-chart with the numbers filled in</strong>

For a vertical bar-chart, H above gives the y coordinate of the centre of each stacked segment. The x coordinate is returned by barplot (invisibly). Be aware that H is read column by column, so we need to <strong>replicate</strong> each element of x nrow(H) times when calling text, so that every segment of a bar gets that bar's x coordinate:

    x <- barplot(h, col = 1 + 1:nrow(h), yaxt = "n")
    text(rep(x, each = nrow(H)), H, labels = paste0(100 * h, "%"))

<a href="https://i.stack.imgur.com/KGWQf.jpg" rel="nofollow"><img alt="vertical barchart" class="b-lazy" data-src="https://i.stack.imgur.com/KGWQf.jpg" data-original="https://i.stack.imgur.com/KGWQf.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
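To see why `each` matters in the call above, the following sketch pairs coordinates without drawing anything. The midpoints 0.7, 1.9 and 3.1 are illustrative stand-ins for what barplot returns for three bars with default width and spacing, and `right` and `wrong` are hypothetical names of mine. Plain recycling of x would cycle through the bars instead of staying on one bar for a whole column of H.

```r
## illustrative bar midpoints (assumed; in practice use the value returned by barplot)
x <- c(0.7, 1.9, 3.1)

## segment centres from Step 2, typed in by hand
H <- matrix(c(0.215, 0.495, 0.650, 0.870,
              0.155, 0.345, 0.530, 0.840,
              0.210, 0.620, 0.840, 0.930), nrow = 4)

## correct pairing: every segment of column j gets x[j]
right <- cbind(x = rep(x, each = nrow(H)), y = as.vector(H))

## what plain recycling of x would pair instead
wrong <- cbind(x = rep(x, length.out = length(H)), y = as.vector(H))

head(right, 5)
head(wrong, 5)
```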

For a horizontal bar-chart, H above gives the x coordinate of the centre of each stacked segment. The y coordinate is returned by barplot (invisibly). Again, because H is read column by column, we need to <strong>replicate</strong> each element of y nrow(H) times when calling text:

    y <- barplot(h, col = 1 + 1:nrow(h), xaxt = "n", horiz = TRUE)
    text(H, rep(y, each = nrow(H)), labels = paste0(100 * h, "%"))

<a href="https://i.stack.imgur.com/L5S9q.jpg" rel="nofollow"><img alt="Horizontal bar-chart" class="b-lazy" data-src="https://i.stack.imgur.com/L5S9q.jpg" data-original="https://i.stack.imgur.com/L5S9q.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
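The vertical and horizontal cases differ only in which axis receives H, so the two snippets can be folded into one function. The sketch below is one possible way to do that, not part of the original answer: `label_stacked` and its `digits` argument are hypothetical names, and the labels are built with sprintf rather than paste0 so that proportions such as 1/3 do not print a long decimal tail.

```r
## hypothetical helper: draw a stacked barplot and label every segment
label_stacked <- function(h, horiz = FALSE, digits = 0, ...) {
  h <- as.matrix(h)
  ## segment centres along the value axis (Step 2)
  mid <- apply(h, 2L, cumsum) - h / 2
  ## bar positions along the other axis, one per column of `h`
  pos <- barplot(h, horiz = horiz, ...)
  ## repeat each bar position once per segment (column-major order)
  pos <- rep(pos, each = nrow(h))
  ## labels such as "43%"; `digits` controls the number of decimals
  lab <- sprintf(paste0("%.", digits, "f%%"), 100 * h)
  if (horiz) {
    text(mid, pos, labels = lab)
  } else {
    text(pos, mid, labels = lab)
  }
  ## return the coordinates invisibly, in the spirit of barplot
  invisible(list(pos = pos, mid = as.vector(mid), labels = lab))
}
```

With the toy matrix h from Step 1, a call such as `label_stacked(h, horiz = TRUE, col = 1 + 1:nrow(h), xaxt = "n")` should give the same picture as the horizontal example; extra arguments are simply passed on to barplot.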


Here is another solution using mapply:

    invisible(mapply(function(k, l) text(x = (k - l/2), y = bc,
                                         labels = paste0(l*100, "%"), cex = 1.5),
                     cumsum(relative.frequencies),
                     relative.frequencies))

mapply is a multivariate version of sapply. In this case, it takes two inputs, cumsum(relative.frequencies) and relative.frequencies, and calls text() once for each pair of elements from those two vectors. The x argument is the coordinate of the label: each cumulative sum minus half of the corresponding relative frequency. The relative frequency is then used again, multiplied by 100, as the label to be plotted.

The invisible() function suppresses the printing of outputs into the console.

<a href="https://i.stack.imgur.com/w49yv.jpg" rel="nofollow"><img alt="enter image description here" class="b-lazy" data-src="https://i.stack.imgur.com/w49yv.jpg" data-original="https://i.stack.imgur.com/w49yv.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
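Since text() accepts vectors for its coordinates and labels, the same labels can also be drawn with one call instead of one call per segment. The following is a sketch of that variant under a few assumptions of mine: the seed is added only for reproducibility, and the name `x` is not from the answer. The sampled values, and hence the number of segments, can differ between R versions.

```r
set.seed(0)  ## assumption: seed added only so the sketch is reproducible
data <- sample(5, 10, replace = TRUE)
colors <- c('yellow', 'violet', 'green', 'pink', 'red')
relative.frequencies <- as.matrix(prop.table(table(data)))

bc <- barplot(relative.frequencies, horiz = TRUE, axes = FALSE, col = colors)

## x positions: cumulative sum minus half of each segment's width
x <- as.vector(cumsum(relative.frequencies) - relative.frequencies / 2)

## one vectorised call; the single y value `bc` is recycled for every label
text(x = x, y = bc, labels = paste0(100 * relative.frequencies, "%"), cex = 1.5)
```

Whether to prefer this over mapply is mostly a matter of taste; the vectorised form avoids the need for invisible().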


