Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

ggplot2 - Labeling Outliers of Boxplots in R

I have code that creates a boxplot using ggplot in R. I want to label my outliers with the year and battle.

Here is the code that creates my boxplot:

library(ggplot2)

ggplot(seabattle, aes(x = PortugesOutcome, y = RatioPort2Dutch)) +
  geom_boxplot(outlier.size = 2, outlier.colour = "green") +
  # `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
  stat_summary(fun = "mean", geom = "point", shape = 23, size = 3, fill = "pink") +
  # axis labels belong in labs(), not in the ggplot() call
  labs(x = "Outcome", y = "Ratio of Portuguese to Dutch/British ships") +
  ggtitle("Portuguese Sea Battles")

Can anyone help? I know this code works; I just want to label the outliers.

1 Reply

The following is a reproducible solution that uses dplyr and the built-in mtcars dataset.

Walking through the code: first, define a function, is_outlier, that returns TRUE for each value lying more than 1.5 times the interquartile range below the first quartile or above the third quartile, and FALSE otherwise. Next, group_by the grouping variable (cyl here; PortugesOutcome in your data) and, inside mutate, add a variable outlier that holds the drat value when it is an outlier (drat corresponds to RatioPort2Dutch in your data) and NA otherwise, so that non-outliers get no label. Finally, draw the boxplot and add the labels with geom_text, mapping the label aesthetic to the new variable. A negative hjust shifts the text slightly to the right, so the values appear next to, rather than on top of, the outlier points.
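To make the 1.5 * IQR rule concrete, here is a minimal base R sketch that works through the arithmetic on a made-up vector. The data and the variable names (lower_fence, upper_fence, flagged) are illustrative only and are not part of the original answer; quantile() and IQR() are used with their default settings.

```r
# Toy data: five ordinary values and one extreme value (made up for illustration)
x <- c(1, 2, 3, 4, 5, 100)

q1  <- unname(quantile(x, 0.25))  # first quartile: 2.25
q3  <- unname(quantile(x, 0.75))  # third quartile: 4.75
iqr <- IQR(x)                     # q3 - q1: 2.5

lower_fence <- q1 - 1.5 * iqr     # -1.5
upper_fence <- q3 + 1.5 * iqr     # 8.5

# A value is flagged when it falls outside either fence
flagged <- x < lower_fence | x > upper_fence
x[flagged]                        # only 100 lies outside the fences
```

The same comparison, applied within each group, is what decides which points receive a label in the plot.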

library(dplyr)
library(ggplot2)

# TRUE for values more than 1.5 * IQR below Q1 or above Q3
is_outlier <- function(x) {
  return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}

mtcars %>%
  group_by(cyl) %>%
  # keep drat for outliers within each cyl group, NA otherwise
  mutate(outlier = ifelse(is_outlier(drat), drat, NA_real_)) %>%
  ggplot(., aes(x = factor(cyl), y = drat)) +
    geom_boxplot() +
    # na.rm = TRUE drops the NA labels without a warning
    geom_text(aes(label = outlier), na.rm = TRUE, hjust = -0.3)

[Image: the resulting boxplot, with each outlier labeled by its drat value]
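The question asks for labels showing the year and battle rather than the plotted value. The same approach might be adapted by putting an identifying column into the label, as in the sketch below. It uses the row names of mtcars as a stand-in identifier; the names cars, car, outlier_label, labelled and p are hypothetical, and for the seabattle data the label would presumably be built from the year and battle columns, whose actual names are not given in the question.

```r
library(dplyr)
library(ggplot2)

# Same 1.5 * IQR rule as in the answer
is_outlier <- function(x) {
  x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x)
}

# Copy the row names into a regular column so they can be used as labels
# (stand-in for an identifying column such as year and battle)
cars <- mtcars
cars$car <- rownames(mtcars)

labelled <- cars %>%
  group_by(cyl) %>%
  # keep the identifier for outliers within each group, NA otherwise
  mutate(outlier_label = ifelse(is_outlier(drat), car, NA_character_)) %>%
  ungroup()

p <- ggplot(labelled, aes(x = factor(cyl), y = drat)) +
  geom_boxplot() +
  # a negative hjust places the text to the right of the point
  geom_text(aes(label = outlier_label), na.rm = TRUE, hjust = -0.1)

print(p)
```

If two columns need to appear in one label, they could be combined with paste() inside mutate before being mapped to the label aesthetic.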

