Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
307 views
in Technique[技术] by (71.8m points)

r - Grouping with custom geom fails - how to inspect internal object from draw_panel()

This is a question related to a custom geom which is modified from this answer. The given geom failed with grouping, so I included coord_munch in draw_panel, much inspired by both GeomLine and GeomPath. It works actually in many cases, but I feel it fails similarly often.

In particular, it seems to fail with groups of two (see example below), and it weirdly fails with certain plots when using patchwork. I opened an issue, but haven't got a reply yet, which I am not quite surprised about, and I agree and feel that this is actually a problem of a poorly written geom, rather than a patchwork problem.

I believe the grouping (in the code, this is marked with ## Work out grouping variables for grobs) used for GeomPath fails for this grob, but I don't know how to inspect the munch object which is created in between.

My main question is, how can I inspect this object?

And if someone sees and understands the issue with my geom, I'd be even more grateful. Cheers

Example:

library(tidyverse)

## this is not an arrange problem, as shown by the correct plot using geom_path
testdf <- testdf %>% arrange(id, group, x) 

Works with geom_path

ggplot(testdf, aes(x, y)) +
  geom_path(aes(group = id))

Fails with geom_trail

ggplot(testdf, aes(x, y)) +
  geom_trail(aes(group = id))

Even worse when using colors

ggplot(testdf, aes(x, y)) +
  geom_trail(aes(group = id, color = group))

Created on 2020-07-02 by the reprex package (v0.3.0)

GeomTrail

geom_trail <-
  function (mapping = NULL, data = NULL, stat = "identity", position = "identity",
            na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...) {
    layer(data = data, mapping = mapping, stat = stat, geom = GeomTrail,
          position = position, show.legend = show.legend, inherit.aes = inherit.aes,
          params = list(na.rm = na.rm, ...))
  }

GeomTrail <- ggplot2::ggproto(
  "GeomTrail", ggplot2::GeomPoint,
  
  default_aes = ggplot2::aes(
    shape = 19, colour = "black", size = 1.5, fill = NA, alpha = NA, stroke = 0.5,
    linesize = 0.5, linetype = 1, gap = .9,
  ),
  
  ## tjebo: 
  ## here is a function handle_na(), which does have no effect on the problem
  
  draw_panel = function(data, panel_params, coord, arrow = NULL,
                        lineend = "butt", linejoin = "round", linemitre = 10,
                        na.rm = FALSE) {
    if (!anyDuplicated(data$group)) {
      message_wrap("geom_path: Each group consists of only one observation. ",
                   "Do you need to adjust the group aesthetic?")
    }
    
    
    # ggplot: 
    ##must be sorted on group
    data <- data[order(data$group), , drop = FALSE]
    munched <- coord_munch(coord, data, panel_params)
    
    # ggplot: 
    ##Default geom point behaviour
    if (is.character(data$shape)) {
      data$shape <- translate_shape_string(data$shape)
    }
    coords <- coord$transform(data, panel_params)
    
    if (unique(coords$size == 0)) {
      my_points <- NULL
    } else {
      my_points <- grid::pointsGrob(
        coords$x,
        coords$y,
        pch = coords$shape,
        gp = grid::gpar(
          col = alpha(coords$colour, coords$alpha),
          fill = alpha(coords$fill, coords$alpha),
          fontsize = coords$size * .pt + coords$stroke * .stroke / 2,
          lwd = coords$stroke * .stroke / 2
        )
      )
    }
    
    # ggplot: 
    ##Silently drop lines with less than two points, preserving order
    rows <- stats::ave(seq_len(nrow(munched)), munched$group, FUN = length)
    munched <- munched[rows >= 2, ]
    if (nrow(munched) < 2) {
      return(zeroGrob())
    }
    
    ## tjebo: 
    ## here, ggplot2:::dapply() checks which grob to use (segment or lines), 
    ## but it also does not seem to have an effect, or at least I don't know 
    ## to change the grob in this case
    
    # teunbrand: 
    # New behaviour
    ## Convert x and y to units
    x <- unit(munched$x, "npc")
    y <- unit(munched$y, "npc")
    
    ## Work out grouping variables for grobs 
    n <- nrow(munched)
    group_diff <- munched$group[-1] != munched$group[-n]
    start <- c(TRUE, group_diff)
    end <- c(group_diff, TRUE)
    
    ## teunbrand: Custom grob class
    my_path <- grid::grob(
      x = x, y = y,
      mult = munched$gap * .pt,
      name = "trail",
      gp = grid::gpar(
        col = alpha(munched$colour, munched$alpha)[!end], # this could also be [start]
        fill = alpha(munched$colour, munched$alpha)[!end],
        lwd = munched$linesize * .pt,
        lty = munched$linetype,
        lineend = "butt",
        linejoin = "round",
        linemitre = 10
      ),
      vp = NULL,
      cl = "trail"
    )

    ggplot2:::ggname(
      "geom_trail",
      grid::grobTree(my_path, my_points)
    )
  }
)

# not modified hook
makeContent.trail <- function(x){ 
  # Convert npcs to absolute units
  x_new <- grid::convertX(x$x, "mm", TRUE)
  y_new <- grid::convertY(x$y, "mm", TRUE)
  
  # Do trigonometry stuff
  hyp <- sqrt(diff(x_new)^2 + diff(y_new)^2)
  sin_plot <- diff(y_new) / hyp
  cos_plot <- diff(x_new) / hyp
  
  diff_x0_seg <- head(x$mult, -1) * cos_plot
  diff_x1_seg <- (hyp - head(x$mult, -1)) * cos_plot
  diff_y0_seg <- head(x$mult, -1) * sin_plot
  diff_y1_seg <- (hyp - head(x$mult, -1)) * sin_plot
  
  x0 = head(x_new, -1) + diff_x0_seg
  x1 = head(x_new, -1) + diff_x1_seg
  y0 = head(y_new, -1) + diff_y0_seg
  y1 = head(y_new, -1) + diff_y1_seg
  keep <- unclass(x0) < unclass(x1)
  
  # Remove old xy coordinates
  x$x <- NULL
  x$y <- NULL
  
  # Supply new xy coordinates
  x$x0 <- unit(x0, "mm")[keep]
  x$x1 <- unit(x1, "mm")[keep]
  x$y0 <- unit(y0, "mm")[keep]
  x$y1 <- unit(y1, "mm")[keep]
  
  # Set to segments class
  class(x)[1] <- 'segments'
  x
}

data

testdf <- tibble(
  id = c("A", "B", "B", "C", "D", "A", "E", "E", "F", "F", "G", "H", "I", "J", "I", "J", "K", "L", "M", "N", "M", "O", "P", "Q", "R", "R", "S", "T", "S", "T"),
  group = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "c", "c", "c", "c", "c", "c", "d", "d", "d", "d", "d", "d", "e", "e", "e", "e", "e", "e"),
  x = c(41, 43, 45, 45, 45, 46, 41, 46, 53, 54, 54, 56, 35, 35, 37, 37, 44, 44, 43, 44, 45, 45, 46, 46, 44, 48, 50, 52, 53, 54),
  y = structure(c(2.2, 1.8, 1.8, 2.3, 2.2, 2.2, 5.3, 2.3, 4.6, 4.6, 4.8, 4.8, 3.9, 4.1, 3.9, 4.1, 3.6, 3.7, 2.8, 2.6, 2.8, 3.1, 3.1, 2.9, 0.7, 0.7, 1, 0.8, 1, 0.8), .Names = c("", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""))
)
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

A couple of things to note here

Debugging ggproto methods

Here are my three favourite debugging methods:

If you're writing ggproto's yourself, you can use RStudio's browser, or superassign objects from the code. Bonus: you can superassign from the debug screen.

GeomDummy <- ggproto(
  "GeomDummy", Geom,
  draw_panel = function(...) { # Doesn't really matter

   # If in RStudio, put this somewhere in the beginning
   browser()

   {...} # Useful code here

   # Superassign data to be debugged to global environment
   debugdata <<- problemdata
  }
)

If debugging immutable code (such as ggplot's own code, unless you forked it), you can still debug it with the browser, but it takes some effort to follow the right paths to get to problematic code:

debugonce(ggplot2:::ggplot_build.ggplot)
# The above is triggered whenever a plot is build before drawing
ggplot(mtcars, aes(wt, mpg)) + geom_point()

You can also debug(ggplot2:::ggplot_build.ggplot), but you'll have to undebug() when done.

Spotted improvements

In the following bits:

if (!anyDuplicated(data$group)) {
    message_wrap("geom_path: Each group consists of only one observation. ",
                 "Do you need to adjust the group aesthetic?")
}
{...}
if (nrow(munched) < 2) {
  return(zeroGrob())
}

This will draw nothing at all, even if there is 1 point to be drawn that doesn't need a segment to connect to itself.

In the code below:

if (unique(coords$size == 0)) {
  my_points <- NULL
}

Typically one would use shape = NA to omit drawing points, but it is not for me to decide how you should write your own geoms. Also, I never seen the if(unique(x == y)) pattern before, but wouldn't this throw a warning if there is both a TRUE case and a FALSE case? It might be useful to replace that with if (all(coords$size == 0)).

That said, the entire conditional point drawing can be reduced to the following structure:

GeomTrail <- ggproto(
  "GeomTrail", GeomPoint,
  draw_panel = function(self, ...usual_arguments...) { # Important to include self
    # Default geom point behaviour
    my_points <- ggproto_parent(GeomPoint, self)$draw_panel(
      data, panel_params, coord, na.rm = na.rm
    )

  {..rest of code goes here..}

  },
  non_missing_aes = c("size", "colour") # Omit shape here
)

Improved ggproto / grid code

The main thing I did was change (x,y) parametrisation to ([x0,x1],[y0,y1]) parametrisation which is used by geom_segments(), that makes the other calculations in the grid code easier to follow too.

Also I switched from makeContent() to makeContext(), because for reasons beyond my understanding the colours wouldn't update.

GeomTrail <- ggplot2::ggproto(
  "GeomTrail", ggplot2::GeomPoint,
  
  default_aes = ggplot2::aes(
    shape = 19, colour = "black", size = 1.5, fill = NA, alpha = NA, stroke = 0.5,
    linesize = 0.5, linetype = 1, gap = .9,
  ),
  
  ## tjebo: 
  ## here is a function handle_na(), which does have no effect on the problem
  
  draw_panel = function(data, panel_params, coord, arrow = NULL,
                        lineend = "butt", linejoin = "round", linemitre = 10,
                        na.rm = FALSE) {
    if (!anyDuplicated(data$group)) {
      message_wrap("geom_path: Each group consists of only one observation. ",
                   "Do you need to adjust the group aesthetic?")
    }
    
    
    # ggplot: 
    ##must be sorted on group
    data <- data[order(data$group), , drop = FALSE]
    
    # ggplot: 
    ##Default geom point behaviour
    if (is.character(data$shape)) {
      data$shape <- translate_shape_string(data$shape)
    }
    coords <- coord$transform(data, panel_params)
    
    if (unique(coords$size == 0)) {
      my_points <- NULL
    } else {
      my_points <- grid::pointsGrob(
        coords$x,
        coords$y,
        pch = coords$shape,
        gp = grid::gpar(
          col = alpha(coords$colour, coords$alpha),
          fill = alpha(coords$fill, coords$alpha),
          fontsize = coords$size * .pt + coords$stroke * .stroke / 2,
          lwd = coords$stroke * .stroke / 2
        )
      )
    }
    
    data <- coord_munch(coord, data, panel_params)
    
    data <- transform(data,
                      xend = c(tail(x, -1), NA),
                      yend = c(tail(y, -1), NA),
                      keep = c(group[-1] == head(group, -1), FALSE))
    data <- data[data$keep, ]
    
    ## Make custom grob class
    my_path <- grid::grob(
      x0 = unit(data$x, "npc"), x1 = unit(data$xend, "npc"),
      y0 = unit(data$y, "npc"), y1 = unit(data$yend, "npc"),
      mult = data$gap * .pt,
      name = "pointpath",
      gp = grid::gpar(
        col = alpha(data$colour, data$alpha),
        lwd = (data$linesize * .pt),
        lty = data$linetype,
        lineend = "butt",
        linejoin = "round", linemitre = 10
      ),
      vp = NULL,
      ### Now this is the important bit:
      cl = "trail"
    )
    
    ggplot2:::ggname(
      "geom_trail",
      grid::grobTree(my_path, my_points)
    )
  }
)

makeContext.trail <- function(x) {
  # Convert npcs to absolute units
  x0 <- grid::convertX(x$x0, "mm", TRUE)
  y0 <- grid::convertY(x$y0, "mm", TRUE)
  x1 <- grid::convertX(x$x1, "mm", TRUE)
  y1 <- grid::convertY(x$y1, "mm", TRUE)
  
  # Do trigonometry stuff
  dx <- x1 - x0
  dy <- y1 - y0
  hyp <- sqrt(dx ^ 2 + dy ^ 2)
  nudge_y <- (dy / hyp) * x$mult
  nudge_x <- (dx / hyp) * x$mult
  
  # Calculate new positions
  x0 <- x0 + nudge_x
  x1 <- x1 - nudge_x
  y0 <- y0 + nudge_y
  y1 <- y1 - nudge_y
  
  # Filter overshoot
  keep <- (sign(dx) == sign(x1 - x0)) & (sign(dy) == sign(y1 - y0))
  x$gp[] <- lapply(x$gp, function(x) {
    if (length(x) == 1L) return(x) else x[keep]
  })
  
  # Supply new xy coordinates
  x$x0 <- unit(x0[keep], "mm")
  x$x1 <- unit(x1[keep], "mm")
  x$y0 <- unit(y0[keep], "mm")
  x$y1 <- unit(y1[keep], "mm")
  
  # Set to segments class
  x$mult <- NULL
  x$id <- NULL
  class(x)[1] <- "segments"
  x
}

End result

It now plots like this:

ggplot(testdf, aes(x, y)) +
    geom_trail(aes(group = id, color = group))

enter image description here

Note:

I didn't actually come up with this on the fly to answer a SO question, I recently had to deal with very similar problems with my own version of this geom.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...