Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


PCRE Regex: Is it possible to check within only the first X characters of a string for a match

PCRE Regex: Is it possible for Regex to check for a pattern match within only the first X characters of a string, ignoring other parts of the string beyond that point?

My Regex:

I have a Regex:

/\S+V\s*/

This checks the string for a run of non-whitespace characters ending in 'V', followed by optional whitespace (or the end of the string).

This works. For example:

Example A:

 SEBSTI FMDE OPORV AWEN STEM students into STEM 

// Match found in 'OPORV' (correct)

Example B:

 ARKFE SSETE BLMI EDSF BRNT CARFR (name removed) Academy Networking Event 
      
//Match not found (correct).   

Re: The capitalised text each letter and the letters placement has a meaning in the source data. This is followed by generic info for humans to read ("Academy Networking Event", etc.)

My Issue:

In theory, the human-readable part can sometimes contain names that include Roman numerals, such as:

Example C:

 ARKFE SSETE BLME CARFR Academy IV Networking Event 
      
//Match found (incorrect).  

I would like my Regex above to only check the first X characters of the string.

Can this be done in PCRE Regex itself? I can't find any reference to length counting in Regex and I suspect this can't easily be achieved. String lengths are completely arbitrary. (We have no control over the source data.)

Intention:

/\S+V\s*/{check within first 25 characters only}
 ARKFE SSETE BLME CARFR Academy IV Networking Event 
                         ^
                         -  Cut off point. Not found so far so stop. 

// Match not found (correct).

Workaround:

The Regex is used in PHP, and my current solution is to cut the string in PHP so that only the first X characters (typically the first 20) are checked. I was curious whether there is a way of doing this within the Regex itself, without needing to manipulate the string directly in PHP.

$valueSubstring = substr($coreRow['value'], 0, 20); /* first 20 characters only */
$virtualCount = preg_match_all('/\S+V\s*/', $valueSubstring);

Question from: https://stackoverflow.com/questions/65911290/pcre-regex-is-it-possible-to-check-within-only-the-first-x-characters-of-a-stri
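
For reference, the workaround above can be run against the three example strings. This is only a minimal sketch: the helper name countVirtualMatches and the $examples array are hypothetical and do not come from the original code, and substr() counts bytes, so it assumes single-byte data like the examples shown.

```php
<?php
// Hypothetical helper (not from the original post): count \S+V\s* matches
// inside the first $limit bytes of $value only.
function countVirtualMatches(string $value, int $limit = 20): int
{
    $head = substr($value, 0, $limit);            // keep only the leading part
    $count = preg_match_all('/\S+V\s*/', $head);  // returns false on a regex error
    return $count === false ? 0 : $count;
}

// The three example strings from the question.
$examples = [
    'A' => ' SEBSTI FMDE OPORV AWEN STEM students into STEM ',
    'B' => ' ARKFE SSETE BLMI EDSF BRNT CARFR (name removed) Academy Networking Event ',
    'C' => ' ARKFE SSETE BLME CARFR Academy IV Networking Event ',
];

foreach ($examples as $label => $value) {
    echo $label, ': ', countVirtualMatches($value), PHP_EOL;
}
```

With the default limit of 20 this prints 1 for A and 0 for B and C. One caveat of cutting the string: a code that straddles the cut-off is truncated, so a token with a 'V' in the middle could be reported as ending in 'V'.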


1 Reply


You can first look for your pattern after the first X chars and, if it is found there, fail and skip the whole string; otherwise, match your pattern as usual. So, if X=25:

^.{25,}\S+V.*(*SKIP)(*F)|\S+V\s*

Details:

  • ^.{25,}\S+V.*(*SKIP)(*F) - start of string, 25 or more chars other than line break chars (as many as possible), then one or more non-whitespace chars and V, and then the rest of the line; the match is then failed and the matched text is skipped
  • | - or
  • \S+V\s* - match one or more non-whitespace chars, V, and zero or more whitespace chars.
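
To see how this behaves on the strings from the question, here is a small PHP sketch. The pattern is the one given above with X=25; the $examples array and the output format are assumptions made for illustration only.

```php
<?php
// Pattern from the answer, with X = 25. Single quotes keep the backslashes intact.
$pattern = '/^.{25,}\S+V.*(*SKIP)(*F)|\S+V\s*/';

// The three example strings from the question.
$examples = [
    'A' => ' SEBSTI FMDE OPORV AWEN STEM students into STEM ',
    'B' => ' ARKFE SSETE BLMI EDSF BRNT CARFR (name removed) Academy Networking Event ',
    'C' => ' ARKFE SSETE BLME CARFR Academy IV Networking Event ',
];

foreach ($examples as $label => $value) {
    $found = preg_match($pattern, $value, $m);
    echo $label, ': ',
        $found === 1 ? "match '" . trim($m[0]) . "'" : 'no match',
        PHP_EOL;
}
```

This should report a match on 'OPORV' for A and no match for B and C. Note that, as written, the first branch discards the entire string whenever a \S+V occurs past the cut-off, so a string with a 'V' code near the start and a Roman numeral later on would also report no match. If that case can occur in the data, cutting the string in PHP first may be the safer option.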
