Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


R: remove multiple text strings in a data frame

New to R. I want to remove certain words from a data frame. Since there are several words, I would like to define them as a character vector and use gsub to remove them, then convert the result back to a data frame that keeps the same structure.

wordstoremove <- c("ai", "computing", "ulitzer", "ibm", "privacy", "cognitive")

a
id                text time      username          
 1     "ai and x"        10     "me"          
 2     "and computing"   5      "you"         
 3     "nothing"         15     "everyone"     
 4     "ibm privacy"     0      "know"        

I was thinking something like:

a2 <- apply(a, 1, gsub(wordstoremove, "", a)

but clearly this doesn't work. After the removal I would convert the result back to a data frame.


1 Reply

wordstoremove <- c("ai", "computing", "ulitzer", "ibm", "privacy", "cognitive")

(dat <- read.table(header = TRUE, text = 'id text time username
1 "ai and x" 10 "me"
2 "and computing" 5 "you"
3 "nothing" 15 "everyone"
4 "ibm privacy" 0 "know"'))

#   id          text time username
# 1  1      ai and x   10       me
# 2  2 and computing    5      you
# 3  3       nothing   15 everyone
# 4  4   ibm privacy    0     know
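
A note on why the attempt in the question fails: gsub() takes a single pattern, so passing the whole vector uses only its first element (R warns about this). The usual workaround is to collapse the words into one regular expression with "|" between them. A minimal sketch of that step on its own; the name pattern is just for illustration:

```r
wordstoremove <- c("ai", "computing", "ulitzer", "ibm", "privacy", "cognitive")

# Join the words with "|" so the regex means "any one of these words"
pattern <- paste(wordstoremove, collapse = "|")
pattern
# "ai|computing|ulitzer|ibm|privacy|cognitive"

# Every match is replaced by the empty string; the space between the words stays
gsub(pattern, "", "ibm privacy")
# " "
```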

# Apply gsub() with the combined pattern to every column, then rebuild the data frame
(dat1 <- as.data.frame(sapply(dat, function(x)
  gsub(paste(wordstoremove, collapse = '|'), '', x))))

#   id    text time username
# 1  1   and x   10       me
# 2  2    and     5      you
# 3  3 nothing   15 everyone
# 4  4            0     know
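
Two caveats with the result above. Because every column goes through gsub() and sapply() returns a character matrix, id and time come back as text rather than numbers. A bare "ai" would also match inside longer words such as "said". If either matters, something along these lines may work better. It is only a sketch: the helper remove_words and the names pattern, dat2 and is_chr are my own, and it assumes the text columns are character vectors rather than factors.

```r
wordstoremove <- c("ai", "computing", "ulitzer", "ibm", "privacy", "cognitive")

dat <- data.frame(
  id = 1:4,
  text = c("ai and x", "and computing", "nothing", "ibm privacy"),
  time = c(10, 5, 15, 0),
  username = c("me", "you", "everyone", "know"),
  stringsAsFactors = FALSE
)

# \\b marks a word boundary, so only whole words are matched
pattern <- paste0("\\b(", paste(wordstoremove, collapse = "|"), ")\\b")

# Hypothetical helper: remove the words, then tidy the leftover whitespace
remove_words <- function(x) {
  x <- gsub(pattern, "", x)
  x <- gsub("\\s+", " ", x)  # collapse runs of whitespace into one space
  trimws(x)                  # drop leading and trailing spaces
}

# Rewrite only the character columns; numeric columns are left untouched
dat2 <- dat
is_chr <- vapply(dat2, is.character, logical(1))
dat2[is_chr] <- lapply(dat2[is_chr], remove_words)

str(dat2)
```

Assigning into dat2[is_chr] keeps the object a data frame throughout, so no conversion back is needed and id and time keep their original types.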


...