Let's say I have a data frame containing a bunch of data and a date/time column indicating when each data point was collected. I have another data frame that lists time spans, where a "Start" column indicates the date/time when each span starts and an "End" column indicates the date/time when each span ends.
I've created a dummy example below using simplified data:
main_data = data.frame(Day=c(1:30))
spans_to_filter =
  data.frame(Span_number = c(1:6),
             Start = c(2,7,1,15,12,23),
             End = c(5,10,4,18,15,26))
I want to keep only the rows of main_data whose Day falls within at least one of the spans. I toyed around with a few approaches and ended up with the following solution:
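library(dplyr)
filtered.main_data =
  main_data %>%
  rowwise() %>%  # evaluate the comparison one row at a time
  mutate(present = any(Day >= spans_to_filter$Start & Day <= spans_to_filter$End)) %>%  # TRUE if Day lies in at least one span
  filter(present) %>%
  as.data.frame()  # drop the rowwise grouping and return a plain data frame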
This works fine, but it can take a while to process when I have a lot of data (I assume because I'm performing a row-wise comparison). I'm still learning the ins and outs of R, and I was wondering whether there is a more efficient way of performing this operation, preferably using dplyr/tidyr?
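To make the row-wise concern concrete, here is a minimal sketch of the same check without rowwise(), using the dummy data from above. It is only one possible direction, not a definitive solution: in_any_span is a hypothetical helper name I made up, and it relies on base R's outer() to compare every Day against every span at once. It builds a matrix with one row per data point and one column per span, so I assume the memory cost could become a problem when both data frames are large.

```r
library(dplyr)

# Dummy data from the question
main_data <- data.frame(Day = 1:30)
spans_to_filter <- data.frame(Span_number = 1:6,
                              Start = c(2, 7, 1, 15, 12, 23),
                              End = c(5, 10, 4, 18, 15, 26))

# Hypothetical helper (not part of dplyr): TRUE for each x that lies
# inside at least one [start, end] interval, bounds included.
in_any_span <- function(x, start, end) {
  # outer() gives a length(x) by length(start) logical matrix,
  # where entry [i, j] says whether x[i] falls inside span j
  hits <- outer(x, start, `>=`) & outer(x, end, `<=`)
  rowSums(hits) > 0
}

# Same filter as before, but evaluated on the whole Day column at once
filtered_vectorised <- main_data %>%
  filter(in_any_span(Day, spans_to_filter$Start, spans_to_filter$End))
```

Unlike the rowwise() version, this does not leave a "present" column in the result, since the logical vector is passed straight to filter().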