Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - apply with ifelse statement and is.na does not 'sum' but outputs matrix - where is my logical mistake?

This is probably a stupid question, but I clearly can't see the problem and would appreciate your help.

Here is a fictional dataset:

dat <- data.frame(ID = c(101, 202, 303, 404),
                  var1 = c(1, NA, 0, 1),
                  var2 = c(NA, NA, 0, 1))

Now I need to create a variable that sums the values per subject. The following works, but it returns 0 rather than NA when both var1 and var2 are NA:

try1 <- apply(dat[,c(2:3)], MARGIN=1, function(x) {sum(x==1, na.rm=TRUE)})

I would like the script to write NA if both var1 and var2 are NA, but if one of the two variables has an actual value, I'd like the script to treat the NA as 0. I have tried this:

check1 <- apply(dat[,2:3], MARGIN=1, function(x) 
{ifelse(x== is.na(dat$var1) & is.na(dat$var2), NA, {sum(x==1, na.rm=TRUE)})})

This, however, produces a 4x4 matrix (int[1:4,1:4]). The real dataset has hundreds of observations, so that just became a mess. Does anybody see where I went wrong?

Thank you!

Question from: https://stackoverflow.com/questions/65939225/apply-with-ifelse-statement-and-is-na-does-not-sum-but-outputs-matrix-where

1 Reply

Here's a working version:

apply(dat[,2:3], MARGIN=1, function(x) 
  {
    if(all(is.na(x))) {
      NA
    } else {
      sum(x==1, na.rm=TRUE)
    }
  }
)
#[1]  1 NA  0  2
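
Since the goal is a new variable per subject, here is a minimal sketch of storing the result as a column. The helper name sum_or_na and the column name total are hypothetical, chosen only for illustration. The second half shows a vectorized variant built on rowSums() that should give the same values without apply(); treat it as an alternative to try, not as the method the answer proposes.

```r
# Same fictional data as in the question
dat <- data.frame(ID = c(101, 202, 303, 404),
                  var1 = c(1, NA, 0, 1),
                  var2 = c(NA, NA, 0, 1))

# Hypothetical helper: count the 1s in a row, or return NA if every value is missing
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_integer_ else sum(x == 1, na.rm = TRUE)
}

# 'total' is an assumed column name
dat$total <- apply(dat[, c("var1", "var2")], MARGIN = 1, sum_or_na)
dat$total
# values: 1 NA 0 2

# Vectorized alternative: count the 1s per row, then blank out all-NA rows
vars <- dat[, c("var1", "var2")]
total_vec <- rowSums(vars == 1, na.rm = TRUE)
total_vec[rowSums(!is.na(vars)) == 0] <- NA
```

With hundreds of rows either version is fine; the rowSums() variant mainly avoids calling an R function once per row.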

Issues with yours:

  • Inside your function(x), x holds the var1 and var2 values for a single row. You don't want to go back and reference dat$var1 and dat$var2, which are whole columns. Just use x.
  • x == is.na(dat$var1) & is.na(dat$var2) does not do what you intended. Because == binds more tightly than &, it is evaluated as (x == is.na(dat$var1)) & is.na(dat$var2): the row's values are compared with a length-4 logical vector, so the test has length 4 for every row. ifelse then returns 4 values per row, and apply binds those into the 4x4 matrix.
  • For a given row, we want to check whether all the values are NA. ifelse is vectorized and returns a vector as long as its test, but we don't want a vector, we want a single TRUE or FALSE indicating whether all values are NA. So we use all(is.na(x)), with if() instead of ifelse().
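
The points above can be checked directly. The sketch below rebuilds by hand what function(x) sees for the first row (the names x and test are only illustrative) and looks at the length of each intermediate result.

```r
# Same fictional data as in the question
dat <- data.frame(ID = c(101, 202, 303, 404),
                  var1 = c(1, NA, 0, 1),
                  var2 = c(NA, NA, 0, 1))

# Roughly what apply() passes to function(x) for row 1: two values
x <- unlist(dat[1, c("var1", "var2")])

# The original condition mixes the row (length 2) with whole columns (length 4)
test <- x == is.na(dat$var1) & is.na(dat$var2)
length(test)                                          # 4, not 1
length(ifelse(test, NA, sum(x == 1, na.rm = TRUE)))   # also 4

# all(is.na(x)) collapses the row to a single TRUE/FALSE, which if() needs
length(all(is.na(x)))                                 # 1

# Four rows, each returning four values, is where the 4x4 matrix comes from
check1 <- apply(dat[, 2:3], MARGIN = 1, function(x) {
  ifelse(x == is.na(dat$var1) & is.na(dat$var2), NA, sum(x == 1, na.rm = TRUE))
})
dim(check1)                                           # 4 4
```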
