Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

javascript - Why does string.split with a regular expression that contains a capturing group return an array that ends with an empty string?

I'd like to split an input string on the first colon that still has characters after it on the same line.

For this, I am using the regular expression /:(.+)/

So given the string

aaa:
bbb:ccc

I'd expect an output of

["aaa:
bbb", "ccc"]

And given the string

aaa:bbb:ccc

I'd expect an output of

["aaa", "bbb:ccc"]

Yet when I actually run these commands, I get the following output:

["aaa:
bbb", "ccc", ""]
["aaa", "bbb:ccc", ""]

So somehow, JavaScript is adding an empty string to the end of the array.

I have checked the documentation for String.split. It does mention that calling split on an empty string with a specified separator returns an array containing one empty string (rather than an empty array), but it makes no mention of the output always ending with an empty string, nor does it warn that some common mistake can produce this result.

I'd understand if my input string had a colon at the end or something like that; then it splits at the colon and the remainder after the match is an empty string. That's the issue described in "Splitting string with regular expression to make it array without empty element", but I don't have this issue, as my input string does not end with my separator.

I know a quick solution in my case is simply to limit the number of returned elements, via "aaa:bbb:ccc".split(/:(.+)/, 2), but I'm still curious:

Why does this string.split call return an array ending with an empty string?


1 Reply


If we change the regex to /:.+/ and split "aaa:bbb:ccc" on it, we get:

["aaa", ""]

This makes sense, as the regex matches :bbb:ccc, which runs to the end of the string, so nothing is left after the separator. You get the same output if you manually split on that string:

>>> 'aaa:bbb:ccc'.split(':bbb:ccc')
['aaa', '']

Adding the capture group just saves the bbb:ccc and inserts it into the result between the two pieces; it doesn't change the original split behaviour, so the trailing empty string is still there.
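To make the comparison concrete, here is a small sketch that runs the three variants side by side on the strings from the question. The helper splitOnce is a hypothetical name introduced only for illustration (it is not part of the original post or of any library); it shows one possible way to get the two-element result without relying on split at all.

```javascript
const input = "aaa:bbb:ccc";

// No capture group: the separator match ":bbb:ccc" reaches the end of the
// string, so the piece after it is the empty string.
const plain = input.split(/:.+/);
console.log(plain); // ["aaa", ""]

// With a capture group: the same two pieces, with the captured text
// inserted between them.
const captured = input.split(/:(.+)/);
console.log(captured); // ["aaa", "bbb:ccc", ""]

// The workaround from the question: a limit of 2 truncates the result.
const limited = input.split(/:(.+)/, 2);
console.log(limited); // ["aaa", "bbb:ccc"]

// Hypothetical helper (an assumption, not from the original post):
// use exec and slice instead of split, so no trailing element appears.
function splitOnce(text) {
  const match = /:(.+)/.exec(text);
  if (match === null) {
    return [text]; // no colon followed by a character on the same line
  }
  return [text.slice(0, match.index), match[1]];
}

console.log(splitOnce("aaa:\nbbb:ccc")); // ["aaa:\nbbb", "ccc"]
```

The limit argument is the shortest fix; the exec-based version may be easier to read if the intent is "split once", since it does not depend on knowing how split treats capture groups.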
