formatting - Error when subsetting based on adjusted values of different data frame in R

Question

Welcome To Ask or Share your Answers For Others

formatting - Error when subsetting based on adjusted values of different data frame in R

posted Jan 31, 2022 in Technique[技术] by 深蓝 (71.8m points)

formatting - Error when subsetting based on adjusted values of different data frame in R

I am asking a side-question about the method I learned here from @redmode :

Subsetting based on values of a different data frame in R

When I try to dynamically adjust the level I want to subset by:

N <- nrow(A)
cond <- sapply(3:N, function(i) sum(A[i,] > 0.95*B[i,])==2)
rbind(A[1:2,], subset(A[3:N,], cond))

I get an error

Error in FUN(left, right) : non-numeric argument to binary operator.

Can you think of a way I can get rows pertaining to values in A that are greater than 95% of the value in B? Thank you.

Here is code for A and B to play with.

A <- structure(list(name1 = c("trt", "0", "1", "10", "1", "1", "10"
), name2 = c("ctrl", "3", "1", "1", "1", "1", "10")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")
B <- structure(list(name1 = c("trt", "0", "1", "1", "1", "1", "9.4"), 
    name2 = c("ctrl", "3", "1", "10", "1", "1", "9.4")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2022-01-31T07:21:35+0000

You have some serious formatting issues with your data.

First, columns should be of the same data type, rows should be observations. (not always true, but a very good way to start) Here you have a row called cond, then a row called hour, then a series of classifications I'm guessing. The way you're data is presented to begin with doesn't make much sense and doesn't lend itself to easy manipulation of your data. But all is not lost. This is what I would do:

Reorganize my data:

C <- data.frame(matrix(as.numeric(unlist(A)), ncol=2)[-(1:2), ])

colnames(C) <- c('A.trt', 'A.cntr')
rownames(C) <- LETTERS[1:nrow(C)]

D <- data.frame(matrix(as.numeric(unlist(B)), ncol=2)[-(1:2), ])

colnames(D) <- c('B.trt', 'B.cntr')

(df <- cbind(C, D))

Which gives:

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# B    10      1   1.0   10.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4

Then you're problem is easily solved by:

df[which(df[, 1] > 0.95*df[, 3] & df[, 2] > 0.95*df[, 4]), ]

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4

Categories

formatting - Error when subsetting based on adjusted values of different data frame in R

formatting - Error when subsetting based on adjusted values of different data frame in R

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags