Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
308 views
in Technique[技术] by (71.8m points)

r - How to merge dataframes and simultanously recode NA's based on values of other rows?

I want to do something which should be relatively easy but I'm finding it impossible. (Relatively new into R).

I have these two datasets:

df1 <- data.frame("Basket" = c("W1", "W1"),          
             "Fruits" = c("apple", "banana"),
             "Color" = c("green", "yellow"),
             "Spanishfruit" = c("manzana", "platano"),
             "Spanishcolor" = c("verde", "amarillo"))

df <- data.frame("Basket" = c("W2", "W2", "W2"),          
              "Fruits" = c("apple", "banana", "grape"),
              "Color" = c("green", "yellow", "purple"),
              "Spanishfruit" = c(NA, NA, NA),
              "Spanishcolor" = c(NA, NA, NA))
             

First, I want to merge them (I'm using bind_rows for that), which is working. However, I also want to recode NA's in the second dataframe so that if the variables fruit and colour are equal between the datasets (and the variable basket is not), the variables "Spanishfruit" and "Spanishcolour" (simply its translations to Spanish) are equal. I.e., I only managed to arrive to this:

df3 <- data.frame("Round" = c("W1", "W1","W2", "W2", "W2" ),          
              "Fruits" = c("apple", "banana", "apple", "banana", "grape"),
              "Color" = c("green", "yellow", "green", "yellow", "purple"),
              "Spanishfruit" = c("manzana", "platano", NA, NA, NA),
              "Spanishcolor" = c("verde", "amarillo", NA, NA, NA)) 

But what I want is this.

df4 <- data.frame("Round" = c("W1", "W1","W2", "W2", "W2" ),          
              "Fruits" = c("apple", "banana", "apple", "banana", "grape"),
              "Color" = c("green", "yellow", "green", "yellow", "purple"),
              "Spanishfruit" = c("manzana", "platano", "manzana", "platano", NA),
              "Spanishcolor" = c("verde", "amarillo", "verde", "amarillo", NA))

I tried several things so far, mostly combining mutate and case_when. I'm thinking the solution might by in merging and establishing the right keys when merging them? I'm really lost and cannot find a solution to this problem anywhere. What I want is that if the values in the columns "fruits" and "color" are equal across one row from df1 and another row from df2, when I merge them, values in "Spanishfruit" and "Spanishcolor" in the rwo from df2 is equal to value in row from df1, while variable basket is still different. Or, put differently: if the values for two variables across rows are equal, I want that the values for two variables to be equal as well, while keeping another variable different.

I'll be really thankful if someone knows how to solve this. It'd be extra helpful to find a tidyverse-based solution, but everything would help.

Thanks!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Is this the output you are looking for?

library(tidyverse)

df1 <- data.frame("Basket" = c("W1", "W1"),          
               "Fruits" = c("apple", "banana"),
               "Color" = c("green", "yellow"),
               "Spanishfruit" = c("manzana", "platano"),
               "Spanishcolor" = c("verde", "amarillo"))

df <- data.frame("Basket" = c("W2", "W2", "W2"),          
              "Fruits" = c("apple", "banana", "grape"),
              "Color" = c("green", "yellow", "purple"),
              "Spanishfruit" = c(NA, NA, NA),
              "Spanishcolor" = c(NA, NA, NA))

df3 <-rbind(df, df1)

df3 %>% group_by(Fruits) %>%
        mutate(Spanishfruit = zoo::na.locf(Spanishfruit, na.rm = FALSE, fromLast = TRUE),
               Spanishcolor = zoo::na.locf(Spanishcolor, na.rm = FALSE, fromLast = TRUE)) %>%
       arrange(Basket, desc())
#> # A tibble: 5 x 5
#> # Groups:   Fruits [3]
#>   Basket Fruits Color  Spanishfruit Spanishcolor
#>   <fct>  <fct>  <fct>  <chr>        <chr>       
#> 1 W2     apple  green  manzana      verde       
#> 2 W2     banana yellow platano      amarillo    
#> 3 W2     grape  purple <NA>         <NA>        
#> 4 W1     apple  green  manzana      verde       
#> 5 W1     banana yellow platano      amarillo

Created on 2021-01-07 by the reprex package (v0.3.0)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...