
Question:
I found a plot in a stats book that I want to reproduce with base R graphics.
The plot looks like this:
<a href="https://i.stack.imgur.com/etIoq.png" rel="nofollow"><img alt="enter image description here" class="b-lazy" data-src="https://i.stack.imgur.com/etIoq.png" data-original="https://i.stack.imgur.com/etIoq.png" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
So far I have the plot, but I am having trouble adding a centred label to each segment of the bar.
My code looks like this:
data <- sample( 5, 10 , replace = TRUE )
colors <- c('yellow','violet','green','pink','red')
relative.frequencies <- as.matrix( prop.table( table( data ) ) )
bc <- barplot( relative.frequencies, horiz = TRUE, axes = FALSE, col = colors )
Answer1: For your given example, we can do the following (<strong>other readers can skip this part and jump to the next</strong>):
set.seed(0) ## `set.seed` for reproducibility
dat <- sample( 5, 10 , replace = TRUE )
colors <- c('yellow','violet','green','pink')
h <- as.matrix( prop.table( table( dat ) ) )
## compute x-location of the centre of each bar
H <- apply(h, 2L, cumsum) - h / 2
## add text to barplot
bc <- barplot(h, horiz = TRUE, axes = FALSE, col = colors )
text(H, bc, labels = paste0(100 * h, "%"))
<a href="https://i.stack.imgur.com/CwbiW.jpg" rel="nofollow"><img alt="strip" class="b-lazy" data-src="https://i.stack.imgur.com/CwbiW.jpg" data-original="https://i.stack.imgur.com/CwbiW.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
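One caveat with paste0(100 * h, "%"): it prints whatever digits the proportions happen to have, so a share such as 1/3 gets a long decimal tail. A minimal sketch, assuming you want a fixed number of decimals, is to build the labels with sprintf() instead. The helper name fmt_pct is purely illustrative and not part of the original answer.

```r
## hypothetical helper: format proportions as percentage labels
## with a fixed number of decimals (default: none)
fmt_pct <- function (p, digits = 0L) {
  sprintf(paste0("%.", digits, "f%%"), 100 * p)
}

p <- c(1, 1, 1) / 3
paste0(100 * p, "%")     ## labels with a long decimal tail
fmt_pct(p)               ## whole percentages
fmt_pct(p, digits = 1L)  ## one decimal place
```

sprintf() returns a plain character vector in column-major order, which is the order text() expects when the coordinates come from a matrix such as H.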
<hr />For all readers: <strong>I will now construct a comprehensive example to illustrate the idea.</strong>
<strong>Step 1: generate a toy matrix of percentages to experiment with</strong>
## a function to generate an `n * p` matrix `h`, with `h > 0` and `colSums(h) = 1`
sim <- function (n, p) {
  set.seed(0)
  ## a positive random matrix with `n` rows and `p` columns
  h <- matrix(runif(n * p), nrow = n)
  ## rescale columns of `h` so that `colSums(h)` is 1
  h <- h / rep(colSums(h), each = n)
  ## for neatness we round `h` to 2 decimals
  h <- round(h, 2L)
  ## but then `colSums(h)` may no longer be exactly 1;
  ## no worry, we simply reset the last row
  ## (`drop = FALSE` keeps a matrix even when `n` is 2)
  h[n, ] <- 1 - colSums(h[-n, , drop = FALSE])
  ## now return this toy matrix
  h
}
h <- sim(4, 3)
#     [,1] [,2] [,3]
#[1,] 0.43 0.31 0.42
#[2,] 0.13 0.07 0.40
#[3,] 0.18 0.30 0.04
#[4,] 0.26 0.32 0.14
<strong>Step 2: understand a stacked bar chart and get the "mid-height" of each stacked segment</strong>
For a stacked bar chart, the top of each segment is the cumulative sum down the corresponding column of h:
H <- apply(h, 2L, cumsum)
#     [,1] [,2] [,3]
#[1,] 0.43 0.31 0.42
#[2,] 0.56 0.38 0.82
#[3,] 0.74 0.68 0.86
#[4,] 1.00 1.00 1.00
We now shift back by h / 2 to get the midpoint of each stacked segment:
H <- H - h / 2
#      [,1]  [,2] [,3]
#[1,] 0.215 0.155 0.21
#[2,] 0.495 0.345 0.62
#[3,] 0.650 0.530 0.84
#[4,] 0.870 0.840 0.93
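The two steps above can be folded into one small function. This is only a sketch: the name seg_mid is hypothetical, and wrapping the result in matrix() is an assumption made here so that a one-row input still yields a matrix.

```r
## hypothetical helper: centre of every segment in a stacked bar chart;
## columns of `h` are the bars, rows are the stacked segments
seg_mid <- function (h) {
  h <- as.matrix(h)
  ## top of each segment: running total down each column
  top <- matrix(apply(h, 2L, cumsum), nrow = nrow(h), dimnames = dimnames(h))
  ## step back by half of the segment's own height
  top - h / 2
}

## the toy matrix from Step 1, typed in by hand
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)
seg_mid(h)
```

Each value lies strictly between the bottom and the top of its segment, so a label placed there sits in the middle of the coloured block.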
<strong>Step 3: produce a bar chart with the numbers filled in</strong>
For a vertical bar chart, H above gives the y coordinate of the centre of each stacked segment. The x coordinate is returned (invisibly) by barplot. Be aware that we need to <strong>replicate</strong> each element of x nrow(H) times when using text, because text expects one (x, y) pair per label:
x <- barplot(h, col = 1 + 1:nrow(h), yaxt = "n")
text(rep(x, each = nrow(H)), H, labels = paste0(100 * h, "%"))
<a href="https://i.stack.imgur.com/KGWQf.jpg" rel="nofollow"><img alt="vertical barchart" class="b-lazy" data-src="https://i.stack.imgur.com/KGWQf.jpg" data-original="https://i.stack.imgur.com/KGWQf.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
For a horizontal bar chart, H above gives the x coordinate of the centre of each stacked segment. The y coordinate is returned (invisibly) by barplot. Be aware that we need to <strong>replicate</strong> each element of y nrow(H) times when using text, because text expects one (x, y) pair per label:
y <- barplot(h, col = 1 + 1:nrow(h), xaxt = "n", horiz = TRUE)
text(H, rep(y, each = nrow(H)), labels = paste0(100 * h, "%"))
<a href="https://i.stack.imgur.com/L5S9q.jpg" rel="nofollow"><img alt="Horizontal bar-chart" class="b-lazy" data-src="https://i.stack.imgur.com/L5S9q.jpg" data-original="https://i.stack.imgur.com/L5S9q.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
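Since the vertical and horizontal cases differ only in which axis receives H, both can be wrapped in one function. The sketch below is one possible way to do so: labelled_barplot is a hypothetical name, not a base R function, and it assumes h is a matrix of proportions with one column per bar.

```r
## hypothetical wrapper around barplot() + text(); not part of base R
labelled_barplot <- function (h, horiz = FALSE, ...) {
  h <- as.matrix(h)
  n <- nrow(h)
  ## centre of each segment along the value axis
  mid <- matrix(apply(h, 2L, cumsum), nrow = n) - h / 2
  ## barplot() invisibly returns the bar midpoints on the other axis
  pos <- barplot(h, horiz = horiz, ...)
  ## one position per segment, in the same column-major order as `mid`
  pos <- rep(pos, each = n)
  lab <- paste0(100 * h, "%")
  if (horiz) text(mid, pos, labels = lab) else text(pos, mid, labels = lab)
  invisible(list(pos = pos, mid = mid))
}

## the toy matrix from Step 1, typed in by hand
h <- matrix(c(0.43, 0.13, 0.18, 0.26,
              0.31, 0.07, 0.30, 0.32,
              0.42, 0.40, 0.04, 0.14), nrow = 4)
labelled_barplot(h, col = 1 + 1:nrow(h), yaxt = "n")
labelled_barplot(h, col = 1 + 1:nrow(h), xaxt = "n", horiz = TRUE)
```

Extra arguments such as col or xaxt are simply passed on to barplot(), so the calls look the same as in the two examples above.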
Answer2: Here is another solution using mapply:
invisible(mapply(function(k, l) text(x = (k - l/2), y = bc,
                                     labels = paste0(l*100, "%"), cex = 1.5),
                 cumsum(relative.frequencies), relative.frequencies))
mapply is a multivariate version of sapply. In this case, it takes two inputs, cumsum(relative.frequencies) and relative.frequencies, and applies the text() function to corresponding pairs of elements from those two vectors. The x = argument gives the coordinates of the labels: each cumulative sum minus half of the corresponding relative frequency. relative.frequencies is then used again for the labels to be plotted.
The invisible() function suppresses printing of the output to the console.
<a href="https://i.stack.imgur.com/w49yv.jpg" rel="nofollow"><img alt="enter image description here" class="b-lazy" data-src="https://i.stack.imgur.com/w49yv.jpg" data-original="https://i.stack.imgur.com/w49yv.jpg" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>
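Because text() is itself vectorised over x, y and labels, the same labels can also be drawn in a single call without mapply. The sketch below rebuilds the question's objects so that it runs on its own. The names rf and x.mid are illustrative, and set.seed(0) is added only for reproducibility.

```r
set.seed(0)
data <- sample(5, 10, replace = TRUE)
colors <- c('yellow', 'violet', 'green', 'pink', 'red')
relative.frequencies <- as.matrix(prop.table(table(data)))
bc <- barplot(relative.frequencies, horiz = TRUE, axes = FALSE, col = colors)

## drop the matrix structure so that cumsum() and the labels line up
rf <- as.vector(relative.frequencies)
## centre of each segment: cumulative sum minus half the segment width
x.mid <- cumsum(rf) - rf / 2
## `y = bc` has length one and is recycled for every label
text(x = x.mid, y = bc, labels = paste0(100 * rf, "%"), cex = 1.5)
```

This draws the same picture as the mapply version; which one to use is mostly a matter of taste.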