Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


dplyr - How to fix the consistency of columns in R?

I have a data frame and need to fix the consistency of two of its columns. I used this code to get the unique values of three columns (pop1_NUMBER, pop2_NUMBER, pop3_NUMBER):

y <- x %>%
  group_by(PROVINCE, DISTRICT, SUB_DISTRI, VILLAGE) %>% 
  summarise(number_1 = unique(pop1_NUMBER),
            number_2 = unique(pop2_NUMBER),
            number_3 = unique(pop3_NUMBER))

I got this error:

Error: Problem with `summarise()` input `number_2`.
x Input `number_2` must be size 38 or 1, not 39.
i An earlier column had size 38.
i Input `number_2` is `unique(pop3_NUMBER)`.
i The error occurred in group 5333: PROVINCE = 11, DISTRICT = 15, SUB_DISTRI = 10, VILLAGE = 37
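
For context, `summarise()` needs every output within a group to have the same size (or size 1), while `unique()` can return a different number of values for each column. One possible way to investigate without hitting that size check is to count distinct values per group with `n_distinct()`, which always returns a single number. This is only a minimal sketch: the data frame `toy` and its single grouping column are hypothetical stand-ins, not the real data.

```r
library(dplyr)

# Hypothetical toy data: village 1 has two different pop2_NUMBER values,
# village 2 is consistent.
toy <- data.frame(
  VILLAGE     = c(1, 1, 1, 2, 2),
  pop1_NUMBER = c(10, 10, 10, 20, 20),
  pop2_NUMBER = c(40, 40, 41, 50, 50)
)

# n_distinct() returns one value per group, so the outputs always line up.
counts <- toy %>%
  group_by(VILLAGE) %>%
  summarise(
    n_pop1 = n_distinct(pop1_NUMBER),
    n_pop2 = n_distinct(pop2_NUMBER),
    .groups = "drop"
  )

# Groups where the two columns disagree on the number of distinct values.
inconsistent <- counts %>%
  filter(n_pop1 != n_pop2)

print(inconsistent)
```

On the real data the grouping would presumably use PROVINCE, DISTRICT, SUB_DISTRI, and VILLAGE instead of the single VILLAGE column, and the rows of `inconsistent` would point at the groups worth inspecting by hand.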

Then I got the unique values of each of these three columns separately with this code:

pop1_number <- x %>%
  group_by(PROVINCE, DISTRICT, SUB_DISTRI, VILLAGE) %>%
  summarise(number = unique(pop1_NUMBER))

The number of unique values differed: 106444 for pop1_NUMBER, 106474 for pop2_NUMBER, and 106456 for pop3_NUMBER. I need pop2_NUMBER and pop3_NUMBER to have 106444 unique values too. This is what I am trying to fix. When I checked those rows of the data frame, I found that some row values were swapped. As you can see in the screenshot, the values in row 137 belong in row 138, and those in row 138 belong in row 137. I am stuck on how to fix this. ![image|690x118](upload://dEYyn4wHHM4fRrqhTpv8XNv3hj2.png) Thank you.

The data frame should look like this one:

df <- data.frame(ID = 1:50,
                 PROVINCE = 11,
                 DISTRICT = rep(1:5, 10),
                 SUB_DISTRI = c(rep(10, 5), rep(20, 10), rep(30, 5), rep(4, 5), rep(5, 5), rep(3, 5), rep(2, 5), rep(1, 5), rep(6, 5)),
                 VILLAGE = c(rep(1, 15), rep(2, 10), rep(3, 10), rep(4, 10), rep(5, 5)),
                 pop1_NUMBER = c(rep(1, 10), rep(2, 10), rep(3, 10), rep(4, 6), rep(5, 6), rep(7, 8)),
                 pop2_NUMBER = c(rep(40, 6), rep(50, 4), rep(60, 5), rep(70, 5), rep(80, 5), rep(90, 5), rep(101, 4), rep(102, 2), rep(103, 4), rep(104, 2), rep(105, 4), rep(106, 4)),
                 pop3_NUMBER = c(rep(200, 3), rep(201, 3), rep(202, 4), rep(203, 5), rep(204, 5), rep(205, 5), rep(206, 5), rep(207, 5), rep(208, 4), rep(209, 3), rep(210, 4),
                                 rep(211, 4)))
Question from: https://stackoverflow.com/questions/65840555/how-to-fix-the-consistency-of-columns-in-r


1 Reply

Waiting for answers
