python - Regexp to remove specific number of occurrences of character only

Question


posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)


Using Python's re module, I have long strings of text containing runs of the > character of varying lengths. A string might have 3 consecutive > characters in the middle, >> at the beginning, or any such combination.

After splitting the string on spaces, I want to iterate over the words and match (and remove) only the runs of exactly 2 occurrences, >>. The run may be at the beginning, middle, or end of the string, with any characters before or after it, and it may even be the entire string.

So far I could come up with:

word = re.sub(r'>{2}', '', word)

This also removes pairs that are part of longer runs: >>> becomes > and >>>> disappears entirely. What regular expression would match only runs of exactly 2? Any help is appreciated.
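
To make the problem concrete, here is a small sketch of what the attempted pattern does. The sample words are made up for illustration and are not from the real data.

```python
import re

# Hypothetical sample words, chosen only to illustrate the behaviour.
samples = ["a>>b", "a>>>b", "a>>>>b", ">>"]

# Apply the attempted pattern to each word and record the result.
results = {word: re.sub(r'>{2}', '', word) for word in samples}

for word, cleaned in results.items():
    print(repr(word), "->", repr(cleaned))
```

Only the first and last words contain a run of exactly two, yet the pattern also alters the two words with longer runs.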



1 Reply

深蓝 · Answer 1 · 2021-10-23T18:42:27+0000

You need to assert that the character of your choice does not appear immediately to the left or to the right of the match, using a pair of lookarounds: a negative lookbehind and a negative lookahead. The general scheme is

(?<!X)X{n}(?!X)

where (?<!X) means no X immediately on the left is allowed, X{n} means n occurrences of X, and (?!X) means no X immediately on the right is allowed.
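
As a rough sketch, the general scheme can be wrapped in a small helper that builds the pattern for any single character and any count. The function name `exact_run_pattern` is an assumption made for this example, not something from the re module.

```python
import re

def exact_run_pattern(char: str, n: int) -> re.Pattern:
    """Compile (?<!X)X{n}(?!X) for a single character X (hypothetical helper)."""
    if len(char) != 1:
        raise ValueError("char must be a single character")
    x = re.escape(char)  # escape in case the character is special, e.g. '*'
    # Doubled braces produce a literal {n} quantifier inside the f-string.
    return re.compile(rf'(?<!{x}){x}{{{n}}}(?!{x})')

two_gt = exact_run_pattern('>', 2)
print(two_gt.sub('', 'a>>b >>>c >>'))

three_stars = exact_run_pattern('*', 3)
print(three_stars.findall('a***b****'))
```

Restricting the helper to a single character keeps the lookbehind fixed-width, which Python's re module requires.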

In this case, use

r'(?<!>)>{2}(?!>)'

See the regex demo.
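
A minimal sketch of how this pattern might be applied to the word-by-word loop described in the question. The names `EXACTLY_TWO` and `strip_double_gt` and the sample text are assumptions for illustration.

```python
import re

# Matches '>>' only when it is not preceded or followed by another '>'.
EXACTLY_TWO = re.compile(r'(?<!>)>{2}(?!>)')

def strip_double_gt(text: str) -> str:
    """Remove runs of exactly two '>' from each space-separated word."""
    return " ".join(EXACTLY_TWO.sub("", word) for word in text.split(" "))

# Hypothetical sample text: '>>' at the start, middle, and end of words,
# plus a run of three that should be left alone.
print(strip_double_gt(">>start mid>>dle keep>>>this end>>"))
```

Because a space is not `>`, applying `EXACTLY_TWO.sub("", text)` to the whole string gives the same result; the split is kept here only to mirror the setup in the question.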
