Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Is there a function to add AOV post-hoc testing results to ggplot2 boxplot?

I'd like to add the results of a Tukey HSD post-hoc test to a ggplot2 boxplot. This SO answer contains a manual example of what I want (i.e., the letters on the plot were added manually; groups that share a letter are not significantly different at the chosen threshold).


Is there a function that automatically adds letters like these to a boxplot, based on AOV and Tukey HSD post-hoc analysis?

I think it would not be too hard to write such a function. It would look something like this:

library(ggplot2)

set.seed(0)
lev <- gl(3, 10)
y <- c(rnorm(10), rnorm(10) + 0.1, rnorm(10) + 3)
d <- data.frame(lev = lev, y = y)

p_base <- ggplot(d, aes(x = lev, y = y)) + geom_boxplot()

a <- aov(y ~ lev, data = d)
tHSD <- TukeyHSD(a)

# Function to generate a data frame of factor levels and corresponding labels
generate_label_df <- function(HSD, factor_levels) {
  comparisons <- rownames(HSD$lev)
  p.vals <- HSD$lev[, "p adj"]

  ## Somehow create a vector of letters, one for each factor level,
  ## using `comparisons` and `p.vals` (placeholder values for now)
  labels <- rep("?", length(factor_levels))
  letter_df <- data.frame(lev = factor_levels, labels = labels)
  letter_df
}

# Add the labels to the plot
p_base +
  geom_text(data = generate_label_df(tHSD, levels(d$lev)), aes(x = lev, y = 0, label = labels))

I realize that the TukeyHSD object has a plot method, and there is another package (which I can't now seem to find) which does what I'm describing in base graphics, but I would really prefer to do this in ggplot2.


1 Reply


You can use 'multcompLetters' from the 'multcompView' package to generate letters for homogeneous groups after a Tukey HSD test. From there, it's a matter of extracting the group label for each factor level tested in the Tukey HSD, along with the maximum value of each group (the last element of Tukey's five-number summary), so that the label can be placed just above it.
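To see what 'multcompLetters' does in isolation, here is a minimal sketch. The p-values are made up for illustration (they are not from the question's data); only the naming scheme, "levelA-levelB", mirrors the row names that TukeyHSD produces.

```r
library(multcompView)

# Hypothetical adjusted p-values, named "levelA-levelB" like the rows of TukeyHSD output
p_adj <- c("2-1" = 0.90, "3-1" = 0.001, "3-2" = 0.002)

# Pairs with p below the threshold (0.05 by default) are treated as different
res <- multcompLetters(p_adj)

# Named character vector: one letter string per factor level
res$Letters
```

With these inputs, levels 1 and 2 should share a letter while level 3 gets a different one. The names of the result are the factor levels, which is what makes it possible to match the letters to the boxes.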

library(plyr)
library(ggplot2)
library(multcompView)

set.seed(0)
lev <- gl(3, 10)
y <- c(rnorm(10), rnorm(10) + 0.1, rnorm(10) + 3)
d <- data.frame(lev=lev, y=y)

a <- aov(y~lev, data=d)
tHSD <- TukeyHSD(a, ordered = FALSE, conf.level = 0.95)

generate_label_df <- function(HSD, flev) {
  # Extract the adjusted p-values (4th column, "p adj") for factor `flev`
  # and convert them to compact letter labels
  Tukey.levels <- HSD[[flev]][, 4]
  Tukey.labels <- multcompLetters(Tukey.levels)['Letters']
  plot.labels <- names(Tukey.labels[['Letters']])

  # Get the maximum of each group (last value of Tukey's five-number summary)
  # and add a small buffer so the label sits just above the highest point.
  # Note: this uses the data frame `d` and its column `y` from the calling environment.
  boxplot.df <- ddply(d, flev, function(x) max(fivenum(x$y)) + 0.2)

  # Create a data frame out of the factor levels and Tukey's homogeneous group letters
  plot.levels <- data.frame(plot.labels, labels = Tukey.labels[['Letters']],
                            stringsAsFactors = FALSE)

  # Merge the letters with the label positions
  labels.df <- merge(plot.levels, boxplot.df, by.x = 'plot.labels', by.y = flev, sort = FALSE)

  return(labels.df)
}

Generate ggplot

p_base <- ggplot(d, aes(x = lev, y = y)) + geom_boxplot() +
  geom_text(data = generate_label_df(tHSD, 'lev'), aes(x = plot.labels, y = V1, label = labels))
p_base

Figure: boxplot with automatic Tukey HSD group label placement.
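The function above reads the data frame `d` and the column `y` from outside its arguments. As a rough sketch only, the same idea can be written with the data passed in explicitly and with base R's tapply in place of plyr. The name `generate_label_df2` and the arguments `data`, `response`, and `offset` are hypothetical, introduced here for illustration; they are not part of the original answer.

```r
library(ggplot2)
library(multcompView)

# Hypothetical variant: `data`, `response`, and `offset` are illustrative arguments
generate_label_df2 <- function(HSD, data, flev, response, offset = 0.2) {
  # One letter string per factor level, derived from the adjusted p-values
  letters_vec <- multcompLetters(HSD[[flev]][, "p adj"])$Letters

  # Per-level maximum of the response, plus a small offset for the label position
  y_max <- tapply(data[[response]], data[[flev]], max) + offset

  # Align the positions with the order of the letters by level name
  lv <- names(letters_vec)
  data.frame(level = lv,
             labels = unname(letters_vec),
             y = as.numeric(y_max[lv]),
             stringsAsFactors = FALSE)
}

set.seed(0)
d <- data.frame(lev = gl(3, 10),
                y = c(rnorm(10), rnorm(10) + 0.1, rnorm(10) + 3))
tHSD <- TukeyHSD(aov(y ~ lev, data = d))

label_df <- generate_label_df2(tHSD, d, "lev", "y")

# Draw the boxplot and place each letter just above its group's maximum
p <- ggplot(d, aes(x = lev, y = y)) +
  geom_boxplot() +
  geom_text(data = label_df, aes(x = level, y = y, label = labels))
p
```

Like the original, this sketch relies on multcompLetters splitting the comparison names on "-", so it assumes the factor levels themselves contain no hyphens.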

