Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
679 views
in Technique[技术] by (71.8m points)

regex - R extract part of string

I have a question about extracting a part of a string. For example I have a string like this:

a <- "DP=26;AN=2;DB=1;AC=1;MQ=56;MZ=0;ST=5:10,7:2;CQ=SYNONYMOUS_CODING;GN=NOC2L;PA=1^1:0.720&2^1:0"

I need to extract everything between GN= and ;.So here it will be NOC2L.

Is that possible?

Note: This is INFO column form VCF file format. GN is Gene Name, so we want to extract gene name from INFO column.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Try this:

sub(".*?GN=(.*?);.*", "\1", a)
# [1] "NOC2L"

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...