Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - Beginning and end of words in sed and grep

I don't understand the difference between \b and \< in GNU sed and GNU grep. It seems to me that \b can always replace \< and \> without changing the set of matching strings.

More specifically, I am trying to find examples in which \bsomething and \<something do not match exactly the same strings.

Same question for something\b and something\>.

Thank you


1 Reply


It very rarely makes a difference whether you use (the more common) \b or (the more specific) \< and \>, but I can think of an example where it would. The example is quite contrived, and in most real-world regex use the choice won't matter, but it demonstrates that the two can behave differently in some cases.

If I have the following text:

this is his pig

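and I want to know whether /\b *is\b/ matches (a boundary, optional spaces, then the word "is"), it wouldn't matter if I instead used /\< *is\>/ or /\> *is\>/. All three match: \b and \< can sit directly in front of the standalone "is", and \> can sit at the end of "this", one space earlier.

But what if my text was instead

is this his pig

There's no longer a word-final boundary before the 'is', only a word-initial boundary. Using /\b *is\b/ matches, and of course /\< *is\>/ does too, but /\> *is\>/ does not.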
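
The comparison can be checked directly with GNU grep. This is only a minimal sketch: it assumes GNU grep (where \b, \< and \> are extensions to POSIX syntax), it assumes the three patterns \b *is\b, \< *is\> and \> *is\>, and the helper name check is made up for illustration.

```shell
#!/bin/sh
# Assumes GNU grep: \b, \< and \> are GNU extensions to POSIX regex syntax.

# Hypothetical helper: print "match" or "no match" for pattern $1 against text $2.
check() {
    if printf '%s\n' "$2" | grep -q -e "$1"; then
        echo "match"
    else
        echo "no match"
    fi
}

# Try each pattern against both example sentences.
for text in 'this is his pig' 'is this his pig'; do
    for pattern in '\b *is\b' '\< *is\>' '\> *is\>'; do
        printf '%-16s %-10s %s\n' "$text" "$pattern" "$(check "$pattern" "$text")"
    done
done
```

Of the six combinations, only \> *is\> against "is this his pig" should report "no match", because no word ends in front of an "is" in that sentence.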

In real life, though, it is not common to need this distinction, which is why (at least outside of sed) \b is the normal word boundary marker for regular expressions.



...