Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Extract a string or value based on a specific word before and a % sign after in R

I have a Text column with thousands of rows of paragraphs, and I want to extract the values of " Capacity > x% ".

The operator can be >, <, =, ~, and so on. I basically need the operator and the integer value (e.g. <40%), placed in a new column next to the text, on the same row.

I have tried removing the text before and after, gsub, grep, grepl, str_extract, etc., none with good results.

I am not sure whether the percent sign is throwing it off or I am just not getting the code structure right.

I would appreciate your assistance.

Here is some of the code I have tried (aa is the data frame, TEXT is the column name):

str_extract(string =aa$TEXT, pattern = perl("(?<=LVEF).*(?=%)"))

gsub(".*[Capacity]([^.]+)[%].*", "\\1", aa$TEXT)

genXtract(aa$TEXT, "Capacity", "%")

gsub("%.*$", "%", aa$TEXT)

grep("^Capacity.*%$",aa$TEXT)
  asked by Shawn, translated from Stack Overflow


1 Reply

Since you did not provide a reproducible example, I created one myself and used it here.

We can use sub to extract everything after "Capacity" up to and including a number followed by a % sign.

sub(".*Capacity(.*\\d+%).*", "\\1", aa$TEXT)
#[1] " > 10%"  " < 40%"  " ~ 230%"
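
One thing to watch with sub: when a row has no "Capacity ... %" in it, sub returns the whole original text unchanged rather than NA. Below is a minimal sketch of one way to handle that and write the result into a new column on the same row. The CAPACITY column name, the fourth sample row, and the tighter operator class [><=~] are my assumptions, not part of the question's data.

```r
# Sample data: the three rows from this answer plus one hypothetical
# row without any "Capacity ... %" value.
aa <- data.frame(TEXT = c("This is a temp text, Capacity > 10%",
                          "This is a temp text, Capacity < 40%",
                          "Capacity ~ 230% more text  ahead",
                          "No matching value in this row"),
                 stringsAsFactors = FALSE)

# Group 1: the operator (assumed to be built from > < = ~).
# Group 2: the digits together with the % sign.
pattern <- ".*Capacity\\s*([><=~]+)\\s*(\\d+%).*"

# sub() leaves non-matching strings untouched, so check for a match first
# and store NA where there is nothing to extract.
has_value <- grepl(pattern, aa$TEXT)
aa$CAPACITY <- ifelse(has_value, sub(pattern, "\\1\\2", aa$TEXT), NA_character_)
```

With this pattern the spaces are dropped, so the new column holds ">10%", "<40%", "~230%" and NA for the row without a match.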

Or with str_extract

stringr::str_extract(aa$TEXT, "(?<=Capacity).*\\d+%")
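
If you would rather stay in base R, the same lookbehind idea should work with regexpr and regmatches, provided perl = TRUE is set (the default regex engine does not support lookbehind). This is a sketch under that assumption; the VALUE column name is made up for illustration, and trimws is only there to drop the leading space.

```r
aa <- data.frame(TEXT = c("This is a temp text, Capacity > 10%",
                          "This is a temp text, Capacity < 40%",
                          "Capacity ~ 230% more text  ahead"),
                 stringsAsFactors = FALSE)

# Locate the text that follows "Capacity" and ends in digits plus a % sign.
m <- regexpr("(?<=Capacity).*\\d+%", aa$TEXT, perl = TRUE)

# regmatches() returns only the matched pieces, so fill the column by
# position and leave NA where regexpr() reported no match (-1).
aa$VALUE <- NA_character_
aa$VALUE[m != -1] <- trimws(regmatches(aa$TEXT, m))
```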

data

aa <- data.frame(TEXT = c("This is a temp text, Capacity > 10%", 
                    "This is a temp text, Capacity < 40%", 
                    "Capacity ~ 230% more text  ahead"), stringsAsFactors = FALSE)
