r - Subset a data frame based on value pairs stored in independent ordered vectors

Question

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have an R data frame that I need to subset based on two of its columns. For example:

A <- c(1,2,3,3,5,1)
B <- c(6,7,8,9,8,8)
Value <- c(9,5,2,1,2,2)
DATA <- data.frame(A,B,Value)

This is how DATA looks:

  A B Value
1 1 6     9
2 2 7     5
3 3 8     2
4 3 9     1
5 5 8     2
6 1 8     2

I want the rows whose (A,B) combination is (1,6) or (3,8). These pairs are stored as separate (ordered) vectors for A and B:

AList <- c(1,3)
BList <- c(6,8)

I am trying to subset the data by checking whether the A column is in AList AND the B column is in BList:

DATA[(DATA$A %in% AList & DATA$B %in% BList),]

The subsetted result is shown below:

  A B Value
1 1 6     9
3 3 8     2
6 1 8     2

In addition to the value pairs (1,6) and (3,8), I am also getting (1,8). The filter tests A and B independently, so it returns every combination of values from AList and BList. How do I restrict it to just (1,6) and (3,8)?

This is my desired result:

A B Value
1 6     9
3 8     2

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:30:42+0000

This is a job for merge:

KEYS <- data.frame(A = AList, B = BList)
merge(DATA, KEYS)

#   A B Value
# 1 1 6     9
# 2 3 8     2
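
A note on why this works: when no by argument is given, merge joins on the columns the two data frames have in common, so a row of DATA is kept only if its whole (A, B) pair matches a row of KEYS. The sketch below spells out the key columns explicitly; the name res is only illustrative and the example data is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data from the question.
DATA <- data.frame(A = c(1, 2, 3, 3, 5, 1),
                   B = c(6, 7, 8, 9, 8, 8),
                   Value = c(9, 5, 2, 1, 2, 2))

# One row per wanted (A, B) pair.
KEYS <- data.frame(A = c(1, 3), B = c(6, 8))

# Same join as merge(DATA, KEYS), with the key columns named explicitly.
res <- merge(DATA, KEYS, by = c("A", "B"))
res
```

Keep in mind that merge returns a new data frame, sorted by the key columns by default and with fresh row names, so the original row order and row names of DATA are not preserved.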

Edit: after the OP expressed his preference for a logical vector in the comments below, I would suggest one of the following.

Use merge:

df.in.df <- function(x, y) {
  common.names <- intersect(names(x), names(y))
  idx <- seq_len(nrow(x))
  x <- x[common.names]
  y <- y[common.names]
  x <- transform(x, .row.idx = idx)
  idx %in% merge(x, y)$.row.idx
}

or interaction:

df.in.df <- function(x, y) {
  common.names <- intersect(names(x), names(y))
  interaction(x[common.names]) %in% interaction(y[common.names])
}

In both cases:

df.in.df(DATA, KEYS)
# [1]  TRUE FALSE  TRUE FALSE FALSE FALSE
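
The logical vector can then be used to index DATA directly, which keeps the original row names and row order. A short sketch using the interaction version, with the example data rebuilt so it runs on its own (the name keep is only illustrative):

```r
# Rebuild the example data and the key pairs from the question.
DATA <- data.frame(A = c(1, 2, 3, 3, 5, 1),
                   B = c(6, 7, 8, 9, 8, 8),
                   Value = c(9, 5, 2, 1, 2, 2))
KEYS <- data.frame(A = c(1, 3), B = c(6, 8))

# TRUE for each row of x whose values in the shared columns
# appear together as a row of y.
df.in.df <- function(x, y) {
  common.names <- intersect(names(x), names(y))
  interaction(x[common.names]) %in% interaction(y[common.names])
}

# Use the logical vector as a row index.
keep <- df.in.df(DATA, KEYS)
DATA[keep, ]
#   A B Value
# 1 1 6     9
# 3 3 8     2
```

Note that the interaction version compares text labels built by joining the key values with ".", so it assumes the key values are formatted identically in both data frames.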
