Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - Documentation for ?: in regex?

A while ago, I saw that in regex (at least in PHP) you can stop a group from capturing by placing ?: right after its opening parenthesis.

Example

$str = 'big blue ball';
$regex = '/b(ig|all)/';
preg_match_all($regex, $str, $matches);
var_dump($matches);

Outputs...

array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(3) "big"
    [1]=>
    string(4) "ball"
  }
  [1]=>
  array(2) {
    [0]=>
    string(2) "ig"
    [1]=>
    string(3) "all"
  }
}

In this example, I don't care about what was matched inside the parentheses, so I prepended ?: to the group ('/b(?:ig|all)/') and got this output:

array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(3) "big"
    [1]=>
    string(4) "ball"
  }
}

This is very useful - at least I think so. Sometimes you just don't want to clutter your matches with unnecessary values.

I was trying to look up the documentation and the official name for this (I call it a non-capturing group, and I think I've heard it called that before).

Because it consists only of symbols, it is hard to Google for.

I have also looked at a number of regex reference guides, but found no mention of it.

Since it starts with ? and appears as the first characters inside the parentheses, I am led to believe it has something to do with lookaheads or lookbehinds.

So, what is the proper name for these, and where can I learn more?


1 Reply


It's documented on the Subpatterns page of the official PHP documentation, which says:

The fact that plain parentheses fulfill two functions is not always helpful. There are often times when a grouping subpattern is required without a capturing requirement. If an opening parenthesis is followed by "?:", the subpattern does not do any capturing, and is not counted when computing the number of any subsequent capturing subpatterns. For example, if the string "the white queen" is matched against the pattern the ((?:red|white) (king|queen)) the captured substrings are "white queen" and "queen", and are numbered 1 and 2. The maximum number of captured substrings is 99, and the maximum number of all subpatterns, both capturing and non-capturing, is 200.
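The numbering described in that excerpt can be checked directly. This is only a minimal sketch: the pattern and subject string come from the excerpt, while the variable names ($pattern, $subject, $m) are my own.

```php
<?php
// Pattern and subject taken from the documentation excerpt above.
$pattern = '/the ((?:red|white) (king|queen))/';
$subject = 'the white queen';

preg_match($pattern, $subject, $m);

// $m[0] is the full match. The (?:red|white) group gets no index of its
// own, so the two capturing groups are numbered 1 and 2.
print_r($m);
// Array
// (
//     [0] => the white queen
//     [1] => white queen
//     [2] => queen
// )
```

If (?:red|white) is changed to (red|white), "white" takes index 2 and "queen" moves to index 3.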

It's also worth noting that the same syntax lets you set options for just that subpattern. For example, if you want only the subpattern to be case-insensitive, you can write:

(?i:foo)bar

Will match:

  • foobar
  • Foobar
  • FoObar
  • ...etc

But not

  • fooBar
  • FooBAR
  • ...etc

Oh, and while the official documentation doesn't explicitly name the syntax where it introduces it, it does refer to it later on as a "non-capturing subpattern" (which makes complete sense, and is what I would call it anyway, since it's not really a "group" but a subpattern).

