Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
297 views
in Technique[技术] by (71.8m points)

Extract numeric part of strings of mixed numbers and characters in R

I have a lot of strings, and each of which tends to have the following format: Ab_Cd-001234.txt I want to replace it with 001234. How can I achieve it in R?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The stringr package has lots of handy shortcuts for this kind of work:

# input data following @agstudy
data <-  c('Ab_Cd-001234.txt','Ab_Cd-001234.txt')

# load library
library(stringr)

# prepare regular expression
regexp <- "[[:digit:]]+"

# process string
str_extract(data, regexp)

Which gives the desired result:

  [1] "001234" "001234"

To explain the regexp a little:

[[:digit:]] is any number 0 to 9

+ means the preceding item (in this case, a digit) will be matched one or more times

This page is also very useful for this kind of string processing: http://en.wikibooks.org/wiki/R_Programming/Text_Processing


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...