I would like to plot each of "x1 by grpA", "x2 by grpA", "x3 by grpA", "x1 by grpB", "x2 by grpB", and "x3 by grpB" using ggplot2::ggplot() inside a for loop.
So far it almost works, but the grouping-variable column passed to facet_grid() does not resolve correctly when I use tidy evaluation. It does work when I type the column name explicitly, but then I cannot change the grouping variable dynamically.
The following code snippet creates a data set that gives context to my question:
library(tidyverse)  # tibble, dplyr, tidyr, ggplot2 (sym() and !! are re-exported from rlang)
set.seed(1)
dfr <- tibble(x1 = factor(sample(letters[1:7], 50, replace = TRUE), levels = letters[1:7]),
              x2 = factor(sample(letters[1:7], 50, replace = TRUE), levels = letters[1:7]),
              x3 = factor(sample(letters[1:7], 50, replace = TRUE), levels = letters[1:7]),
              grpA = factor(sample(c("grp1", "grp2"), 50, prob = c(0.3, 0.7), replace = TRUE), levels = c("grp1", "grp2")),
              grpB = factor(sample(c("grp1", "grp2"), 50, prob = c(0.6, 0.4), replace = TRUE), levels = c("grp1", "grp2"))
)
head(dfr)
I also provide a function that creates the plotting data I need to make the grouped plots. It accepts strings as arguments for the parameters 'groupvar' and 'mainvar':
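plot_data_prepr <- function(dat, groupvar, mainvar) {
  # Convert the column names (strings) to symbols so they can be unquoted with !!
  groupvar <- sym(groupvar)
  mainvar <- sym(mainvar)
  plot_data <- dat %>%
    group_by(!!groupvar) %>%
    # .drop = FALSE keeps factor levels that have a count of zero
    count(!!mainvar, .drop = FALSE) %>%
    drop_na() %>%
    mutate(pct = n / sum(n),
           # empty levels get a small placeholder value (0.005) instead of 0
           pct2 = ifelse(n == 0, 0.005, n / sum(n)),
           grp_tot = sum(n),
           pct_lab = paste0(format(pct * 100, digits = 1), "%"),
           pct_pos = pct2 + .02)
  return(plot_data)
}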
Here is normal usage of the function:
plot_data_prepr(dat = dfr, groupvar = "grpA", mainvar = "x1")
Here is my for loop, which fails when I use tidy evaluation in facet_grid() in the context of ggplot(). The error returned is "Error in !sgvar : invalid argument type".
"FAILING EXAMPLE:"
for (i in seq_along(names(dfr)[1:3])) {
  mvar <- names(dfr)[i]
  print(mvar)
  gvar <- names(dfr[4])
  print(gvar)
  smvar <- sym(mvar)
  sgvar <- sym(gvar)
  plot <- ggplot(data = plot_data_prepr(dfr, gvar, mvar),
                 mapping = aes(x = !!smvar, y = pct2, fill = !!smvar)) +
    geom_bar(stat = 'identity') +
    ylim(0, 1) +
    geom_text(aes(x = !!smvar, label = pct_lab, y = pct_pos + .02)) +
    facet_grid(. ~ !!sgvar) +
    ggtitle(paste0(mvar, " by ", gvar))
  print(plot)
}
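For context, here is a minimal sketch that seems to reproduce the same message outside of ggplot2, assuming only that the rlang package is installed. It suggests that `!!` acts as an unquoting operator only inside functions that support quasiquotation (such as aes() or rlang::expr()); anywhere else R appears to parse it as two applications of logical negation, which is not defined for a symbol. The variable names `msg` and `unquoted` are made up for this illustration.

```r
library(rlang)

sgvar <- sym("grpA")

# Outside a quasiquoting function, `!!sgvar` is evaluated as `!(!sgvar)`,
# and `!` cannot be applied to a symbol, so an error is raised.
msg <- tryCatch(
  {
    !!sgvar
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Inside a quasiquoting function such as rlang::expr(), the same syntax
# is treated as "unquote", and the symbol is inserted into the formula.
unquoted <- expr(. ~ !!sgvar)
print(unquoted)
```

A plain formula such as `. ~ !!sgvar` is created by `~`, which quotes its operands but does not process `!!`, so the negation is presumably only attempted later, when the facet variable is evaluated.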
When I run the loop by explicitly typing grpA in place of !!sgvar in the facet_grid() function, it works for some reason:
"FUNCTIONING BUT NOT WHAT I WANT EXAMPLE:"
for (i in seq_along(names(dfr)[1:3])) {
  mvar <- names(dfr)[i]
  print(mvar)
  gvar <- names(dfr[4])
  print(gvar)
  smvar <- sym(mvar)
  sgvar <- sym(gvar)
  plot <- ggplot(data = plot_data_prepr(dfr, gvar, mvar),
                 mapping = aes(x = !!smvar, y = pct2, fill = !!smvar)) +
    geom_bar(stat = 'identity') +
    ylim(0, 1) +
    geom_text(aes(x = !!smvar, label = pct_lab, y = pct_pos + .02)) +
    facet_grid(. ~ grpA) +
    ggtitle(paste0(mvar, " by ", gvar))
  print(plot)
}
Of course, if I want to loop through a set of grouping variables, typing each one explicitly defeats the purpose of the loop. Could someone explain why the 'bang bang' operator (!!) inside facet_grid() doesn't work in the 'FAILING EXAMPLE', and suggest how to remedy this error?
Thank you.
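One possible direction, sketched here under the assumption that ggplot2 is version 3.0.0 or later: facet_grid() also accepts `rows` and `cols` arguments built with vars(), and vars() supports unquoting with `!!`. Building the formula from a string with as.formula() is an alternative that needs no tidy evaluation at all. The data frame `toy` and the objects `plots` and `p_formula` are hypothetical stand-ins for illustration, not part of the code above.

```r
library(ggplot2)
library(rlang)

# Small stand-in data set (hypothetical, for illustration only).
toy <- data.frame(
  x1   = factor(c("a", "b", "a", "b")),
  pct2 = c(0.4, 0.6, 0.7, 0.3),
  grpA = factor(c("grp1", "grp1", "grp2", "grp2")),
  grpB = factor(c("grp1", "grp2", "grp1", "grp2"))
)

plots <- list()
for (gvar in c("grpA", "grpB")) {
  sgvar <- sym(gvar)
  plots[[gvar]] <- ggplot(toy, aes(x = x1, y = pct2, fill = x1)) +
    geom_col() +
    ylim(0, 1) +
    # Option 1: vars() is a quasiquoting function, so !! works here.
    facet_grid(cols = vars(!!sgvar)) +
    ggtitle(paste0("x1 by ", gvar))
  print(plots[[gvar]])
}

# Option 2: build the facet formula from the column name as a string.
p_formula <- ggplot(toy, aes(x = x1, y = pct2, fill = x1)) +
  geom_col() +
  facet_grid(as.formula(paste(". ~", "grpB")))
print(p_formula)
```

In the original loop, the equivalent change would be to swap `facet_grid(. ~ !!sgvar)` for `facet_grid(cols = vars(!!sgvar))` and iterate `gvar` over `names(dfr)[4:5]`.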