Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
541 views
in Technique[技术] by (71.8m points)

regex - Why are there so many different regular expression dialects?

I'm wondering why there have to be so many regular expression dialects. Why does it seem like so many languages, rather then reusing a tried and true dialect, seem bent on writing their own.

Like these.

I mean, I understand that some of these do have very different backends. But shouldn't that be abstracted from the programmer?

I'm more referring to the odd but small differences, like where parentheses have to be escaped in one language, but are literals in another. Or where meta-characters mean somewhat different things.

Is there any particular reason we can't have some sort of universal dialect for regular expressions? I would think it would make things much easier for programmers who have to work in multiple languages.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Because regular expressions only have three operations:

  • Concatenation
  • Union |
  • Kleene closure *

Everything else is an extension or syntactic sugar, and so has no source for standardization. Things like capturing groups, backreferences, character classes, cardinality operations, etc are all additions to the original definition of regular expressions.

Some of these extensions make "regular expressions" no longer regular at all. They are able to decide non-regular languages because of these extras, but we still call them regular expressions regardless.

As people add more extensions, they will often try to use other, common variations of regular expressions. That's why nearly every dialect uses X+ to mean "one or many Xs", which itself is just a shortcut for writing XX*.

But when new features get added, there's no basis for standardization, so someone has to make something up. If more than one group of designers come up with similar ideas at around the same time, they'll have different dialects.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...