Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


dplyr - separate_rows in R doesn't work with "|" as the separator

I have a CSV file with a column that holds a variable-length list of items separated by |.

I use the code below:

violations <- inspections %>% head(100) %>% 
  select(`Inspection ID`,Violations) %>% 
  separate_rows(Violations,sep = "|")

but this only creates a new row for each character in the field (including spaces)

What am I missing here on how to separate this column?


1 Reply


It's hard to help without a better description of your data and an example of what the correct output would look like. That said, I think part of your confusion comes from the documentation for separate_rows. A similar function, separate, documents its sep argument as:

If character, sep is interpreted as a regular expression. The default value is a regular expression that matches any sequence of non-alphanumeric values.

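The documentation for the sep argument of separate_rows doesn't say the same thing, though I think it behaves the same way. In a regular expression, | is the alternation operator, and a bare | matches the empty string at every position, which is why each character ended up on its own row. To match a literal pipe it must be escaped as \|, which is written "\\|" inside an R string.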

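library(tidyr)  # provides separate_rows() and re-exports tibble()

df <- tibble(
  Inspection_ID = c(1, 2, 3),
  Violations = c("A", "A|B", "A|B|C"))
# "\\|" is the regex \| (a literal pipe); a single backslash is not a valid R string escape
separate_rows(df, Violations, sep = "\\|")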

Yields

# A tibble: 6 x 2
  Inspection_ID Violations
          <dbl> <chr>     
1             1 A         
2             2 A         
3             2 B         
4             3 A         
5             3 B         
6             3 C      
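
To see why the unescaped pattern splits on every character, the same behaviour can be reproduced with base R's strsplit, which also treats its split argument as a regular expression by default. This is only a sketch on a made-up string (x is a hypothetical stand-in for one value of the Violations column), and it shows two alternatives to the backslash escape for strsplit; I have not checked whether separate_rows accepts anything other than a regex string for sep.

```r
# Base R only; strsplit() interprets `split` as a regular expression by default.
x <- "A|B|C"  # hypothetical value, standing in for one cell of the column

# Unescaped: "|" is an alternation of two empty patterns, so it matches the
# empty string and the input is split into single characters.
unescaped <- strsplit(x, "|")[[1]]

# Escaped: "\\|" in R source is the regex \|, which matches a literal pipe.
escaped <- strsplit(x, "\\|")[[1]]

# Alternatives for strsplit(): a character class, or a fixed (non-regex) match.
char_class  <- strsplit(x, "[|]")[[1]]
fixed_match <- strsplit(x, "|", fixed = TRUE)[[1]]

print(unescaped)    # "A" "|" "B" "|" "C"
print(escaped)      # "A" "B" "C"
print(char_class)   # "A" "B" "C"
print(fixed_match)  # "A" "B" "C"
```

The character class "[|]" is also a plain regex, so it should work as the sep argument wherever "\\|" does, and it avoids the double backslash.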
