php - Can str_replace be safely used on a UTF-8 encoded string if it's only given valid UTF-8 encoded strings as arguments?

Question

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:00:05+0000

Yes. UTF-8 is deliberately designed to allow this and other similar non-Unicode-aware processing.

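In UTF-8, every non-ASCII character is encoded as a multi-byte sequence that begins with a lead byte in the range 0xC0-0xFF (0xC2-0xF4 in well-formed UTF-8), followed by one to three continuation bytes in the range 0x80-0xBF. A lead byte can never appear as a continuation byte, and an ASCII byte (0x00-0x7F) can never appear inside a multi-byte sequence. As a result, a valid UTF-8 search string can only match at character boundaries; it can never match part of another character.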
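As a rough illustration of the property described above, here is a minimal sketch. The helper `utf8_byte_role()` is a hypothetical name introduced only for this example, and the sample string is made up. It uses "©" (bytes C2 A9) and "é" (bytes C3 A9), which share the trailing byte A9, to show that `str_replace` still only matches whole characters.

```php
<?php
// Hypothetical helper (not a PHP built-in): classify one byte by its UTF-8 role.
function utf8_byte_role(int $byte): string
{
    if ($byte <= 0x7F) {
        return 'ascii';        // single-byte character
    }
    if ($byte <= 0xBF) {
        return 'continuation'; // 0x80-0xBF: never starts a character
    }
    if ($byte >= 0xC2 && $byte <= 0xF4) {
        return 'lead';         // starts a 2-, 3- or 4-byte sequence
    }
    return 'invalid';          // 0xC0, 0xC1, 0xF5-0xFF never occur in valid UTF-8
}

// "café © naïve", written with \u{...} escapes so the example does not
// depend on the encoding of this source file (requires PHP 7.0+).
$haystack = "caf\u{E9} \u{A9} na\u{EF}ve";

// The needle "é" is the two bytes C3 A9. The "©" in the haystack is C2 A9:
// it shares the final byte, but the full needle cannot match inside it.
$result = str_replace("\u{E9}", "e", $haystack);

echo $result, "\n";
```

Only the "é" is replaced; the "©" and "ï" are left intact because the needle begins with a lead byte, which can only line up with the start of a character in the haystack.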

This is not the case for older multibyte encodings, in which lead and trail bytes share the same ranges and cannot be told apart out of context. This caused a lot of problems. For example, replacing an ASCII backslash in a Shift-JIS string could corrupt the text, because byte 0x5C may also be the second byte of a two-byte character.
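The Shift-JIS problem can be reproduced with raw bytes. This is a small sketch that assumes the katakana "ソ" (U+30BD) as the sample character; it is encoded as 83 5C in Shift-JIS and as E3 82 BD in UTF-8. The variable names are illustrative only.

```php
<?php
// "ソ" (U+30BD) in Shift-JIS: the second byte is 0x5C, the same as ASCII "\".
$sjis = "\x83\x5C";

// The same character in UTF-8: E3 82 BD, which contains no byte below 0x80.
$utf8 = "\u{30BD}";

// A byte-oriented replace of backslash with forward slash.
$sjisResult = str_replace("\\", "/", $sjis);
$utf8Result = str_replace("\\", "/", $utf8);

echo bin2hex($sjisResult), "\n"; // 832f   - trail byte rewritten, character corrupted
echo bin2hex($utf8Result), "\n"; // e382bd - unchanged
```

The Shift-JIS string is damaged because `str_replace` cannot tell the trail byte 0x5C from a real backslash, while the UTF-8 string is untouched.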
