Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Remove the letters between two patterns of strings in R

How can I remove the letters between two specific patterns in R?

For instance:

a= "a#g abcdefgtdkfef_jpg>pple"

I would like to remove everything from #g through jpg> (both markers included), so that the result is:

a1="apple"

I tried to find a suitable function in stringr but couldn't.



1 Reply


There's no need to load a package for this operation. You can use the base R function sub, which replaces the first match of a regular expression (gsub replaces every match).

a <- "a#g abcdefgtdkfef_jpg>pple"
sub("#g.*jpg>", "", a)
# [1] "apple"

Regular expression explained:

  • #g matches "#g"
  • .* matches any character, zero or more times (greedy, so it extends to the last jpg> in the string)
  • jpg> matches "jpg>"

So here we're removing everything from #g up to and including jpg>.
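Because .* is greedy, the pattern behaves differently once a string contains more than one #g ... jpg> pair. The sketch below is only an illustration: the test string b is made up, and the lazy quantifier .*? and gsub are swapped in to show the alternatives, not something the question asked for.

```r
# Hypothetical input with two "#g ... jpg>" pairs (not from the question)
b <- "a#g one_jpg>pple and b#g two_jpg>anana"

# Greedy .* runs from the first "#g" to the LAST "jpg>"
greedy <- sub("#g.*jpg>", "", b)
greedy
# [1] "aanana"

# Lazy .*? stops at the first "jpg>" that follows "#g"
lazy_first <- sub("#g.*?jpg>", "", b)
lazy_first
# [1] "apple and b#g two_jpg>anana"

# gsub() applies the lazy pattern to every pair
lazy_all <- gsub("#g.*?jpg>", "", b)
lazy_all
# [1] "apple and banana"
```

For the single pair in the question, the greedy and lazy patterns give the same result.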


Regarding your comment:

I tried to find some function in stringR but I couldn't

It's actually spelled stringr (package names in R are case-sensitive). You could use str_replace, which also replaces only the first match.

library(stringr)
str_replace(a, "#g.*jpg>", "")
# [1] "apple"
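If the two markers vary, the same idea can be wrapped in a small function. This is a minimal sketch rather than a definitive implementation: the name remove_between is hypothetical, it assumes both markers are literal strings (quoted with \Q...\E, which needs perl = TRUE), and it uses a lazy match so that only the nearest end marker is consumed.

```r
# Hypothetical helper: delete `start`, `end`, and everything between them.
# Assumes the markers are literal text that does not itself contain "\E".
remove_between <- function(x, start, end) {
  # \Q...\E makes PCRE treat the markers literally; .*? is a lazy match
  pattern <- paste0("\\Q", start, "\\E.*?\\Q", end, "\\E")
  sub(pattern, "", x, perl = TRUE)
}

a <- "a#g abcdefgtdkfef_jpg>pple"
remove_between(a, "#g", "jpg>")
# [1] "apple"

# Markers containing regex metacharacters are matched literally
remove_between("x(.)abc[*]y", "(.)", "[*]")
# [1] "xy"
```

Swapping sub for gsub inside the function would remove every such span instead of only the first.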
