Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


na - How to find the last column with a value (for each row) in R?

Suppose there is a data frame with several measurements per row, some of which are missing. Once a value is missing in a row, all following measurements in that row are missing too. How can the last available measurement be found for each row?

df <- data.frame(id = c(1, 2, 3, 4), m_1 = c('a', 'b', 'c', 'd'), m_2 = c('e', NA, 'g', 'h'), m_3 = c('i', NA, NA, 'l'))

df
    id   m_1   m_2   m_3
[1]  1     a     e     i
[2]  2     b  <NA>  <NA>
[3]  3     c     g  <NA>
[4]  4     d     h     l

Either of the following two results would work for me: the name of the last non-missing column, or its value.

df
    id   m_1   m_2   m_3    m
[1]  1     a     e     i  m_3
[2]  2     b  <NA>  <NA>  m_1
[3]  3     c     g  <NA>  m_2
[4]  4     d     h     l  m_3

df
    id   m_1   m_2   m_3   m
[1]  1     a     e     i   i
[2]  2     b  <NA>  <NA>   b
[3]  3     c     g  <NA>   g
[4]  4     d     h     l   l

I tried combining mutate with which, colnames and is.na, but could not get it to work.



1 Reply


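One option is max.col from base R, which returns, for each row, the column index of the maximum value. Applied to the logical matrix !is.na(df[-1]), the maximum is TRUE, so the result is the index of a non-NA element. The ties.method argument can be "random", "first" or "last"; since we want the last non-NA element, specify "last".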

df$m <- names(df)[-1][max.col(!is.na(df[-1]), 'last')]
df$m
#[1] "m_3" "m_1" "m_2" "m_3"
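To see why the one-liner works, it can help to look at the intermediate pieces. The sketch below rebuilds the example data and splits the call into steps. The names present, last_idx and last_name are only illustrative, and stringsAsFactors = FALSE is set explicitly as an assumption so the columns stay character on R versions before 4.0.

```r
df <- data.frame(id = c(1, 2, 3, 4),
                 m_1 = c('a', 'b', 'c', 'd'),
                 m_2 = c('e', NA, 'g', 'h'),
                 m_3 = c('i', NA, NA, 'l'),
                 stringsAsFactors = FALSE)

# TRUE where a measurement is present, FALSE where it is NA
present <- !is.na(df[-1])

# TRUE ranks above FALSE, so each row maximum is TRUE; breaking ties with
# 'last' picks the right-most TRUE, i.e. the last non-NA column
last_idx <- max.col(present, ties.method = 'last')

# translate the positions back into column names
last_name <- names(df)[-1][last_idx]
```

Here last_idx holds positions within the measurement columns only (the id column was dropped with df[-1]), which is why the names are also taken from names(df)[-1].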

For the second option, cbind the row index with that column index to form a two-column index matrix, and use it to extract the elements

df[-1][cbind(seq_len(nrow(df)), max.col(!is.na(df[-1]), 'last'))]
#[1] "i" "b" "g" "l"
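One caveat, not covered by the question's data: if a row has no measurement at all, every entry of the logical matrix is FALSE, and max.col with 'last' still returns the last column index. A minimal sketch of a guard, assuming such rows should yield NA; the data frame df2 and the variable names are made up for illustration.

```r
# hypothetical data in which the second row has no measurement at all
df2 <- data.frame(id = c(1, 2, 3),
                  m_1 = c('a', NA, 'c'),
                  m_2 = c('e', NA, NA),
                  stringsAsFactors = FALSE)

present <- !is.na(df2[-1])
last_idx <- max.col(present, ties.method = 'last')

# rows without any TRUE would otherwise point at the last column;
# mark them as NA instead
last_idx[rowSums(present) == 0] <- NA

# an NA in the index propagates to an NA in both results
last_name  <- names(df2)[-1][last_idx]
last_value <- df2[-1][cbind(seq_len(nrow(df2)), last_idx)]
```

With this guard, the all-missing row gets NA for both the column name and the value instead of silently reporting the last column.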

The value can also be obtained with the tidyverse, working row by row

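library(dplyr)
df %>%
  rowwise() %>%
  # take the measurement columns of the current row, drop the NAs,
  # and keep the last remaining value; 'm_' avoids matching a column
  # named m added earlier
  mutate(m = tail(na.omit(c_across(starts_with('m_'))), 1)) %>%
  ungroup()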

To get both the column name and the value at once, one option is to reshape to 'long' format, keep the last non-NA row per id, and join the result back to the original data

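library(dplyr)
library(tidyr)
df %>%
  # one row per non-missing measurement
  pivot_longer(cols = starts_with('m_'), values_drop_na = TRUE,
               names_to = 'm_name', values_to = 'm_value') %>%
  group_by(id) %>%
  # keep the last non-missing measurement of each id
  slice_tail(n = 1) %>%
  ungroup() %>%
  # bring back the original wide columns
  right_join(df, by = 'id') %>%
  select(all_of(names(df)), everything())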

Output:

# A tibble: 4 x 6
#     id m_1   m_2   m_3   m_name m_value
#  <dbl> <chr> <chr> <chr> <chr>  <chr>  
#1     1 a     e     i     m_3    i      
#2     2 b     <NA>  <NA>  m_1    b      
#3     3 c     g     <NA>  m_2    g      
#4     4 d     h     l     m_3    l      
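For comparison, the same two columns can be built in base R without reshaping, by reusing the max.col index for both the name and the value. This is only a sketch under the question's assumption that every row has at least one measurement; meas, out and the column names m_name and m_value simply mirror the output above.

```r
df <- data.frame(id = c(1, 2, 3, 4),
                 m_1 = c('a', 'b', 'c', 'd'),
                 m_2 = c('e', NA, 'g', 'h'),
                 m_3 = c('i', NA, NA, 'l'),
                 stringsAsFactors = FALSE)

# measurement columns only (everything except id)
meas <- df[-1]

# position of the last non-NA measurement in each row
last_idx <- max.col(!is.na(meas), ties.method = 'last')

# work on a copy so the original data frame is left untouched
out <- df
out$m_name  <- names(meas)[last_idx]
out$m_value <- meas[cbind(seq_len(nrow(meas)), last_idx)]
```

The result is a plain data frame rather than a tibble, with the original four columns followed by m_name and m_value.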
